Navigating the COTS Sea


As  I  scratch  my  head  for  the  thousandth  time  wondering  who  came  up  with  the 
bright  idea  of  the  standard  desktop  computer,  it  occurs  to  me  that  this  month’s 
Crosstalk  theme  is  extremely  pertinent.  I  am  sure  that  the  promoters  of  the  stan¬ 
dard  desktop  did  not  consider  that  a  software  maintenance  group  might  need  multiple 
versions  of  a  single  software  package  loaded  onto  the  same  desktop.  Nor  did  they 
imagine  that  continually  pushing  out  patches  would  create  a  configuration  management 
nightmare  in  our  software  integration  laboratories. 

Commercial  off-the-shelf  (COTS)  software  solutions  have  long  been  touted  as  the  best-  or 
least-cost  solution  to  many  software  design  requirements.  We  save  major  development  costs  by 
using  commercial  products  and,  in  theory,  can  significantly  accelerate  the  delivery  schedule.  The 
Air  Force  is  engaging  in  this  venture  on  a  massive  scale  with  the  implementation  of 
Expeditionary  Combat  Support  System  (ECSS)  -  based  on  a  commercially  available  Enterprise 
Resource Planning system. The projected life-cycle cost savings due to phasing out legacy information
systems are staggering. However, the success of ECSS, and of all COTS software implementations,
is dependent on a realistic assessment of all costs, benefits, and risks.

While I have never led a successful COTS implementation, I’ve witnessed quite a few failures
and participated in many re-vectoring efforts. The following is some advice for anyone
starting down that road:

1. Remember the Titanic. Don’t just look at the tip of the iceberg; there’s plenty hidden
beneath the water to help sink the ship. Look at long-term license and maintenance fees. Check
interfaces to see how many will have to be changed if you field a COTS solution. Expect interface
control documents to have discrepancies. Ensure that you have sufficient data rights to
maintain the product once it is fielded.

2.  Caveat  emptor:  Buyer  beware!  Do  your  research,  define  your  requirements,  establish 
acceptance  criteria,  read  the  fine  print  and  remember  that  the  vendor  is  selling  a  product  and  is 
rewarded  based  on  the  number  of  sales.  Ask  for  references  from  other  customers  who  have  suc¬ 
cessfully  converted  to  the  product. 

3.  Resist  the  temptation  to  modify  COTS.  If  you  start  talking  about  GOTS  (government 
off-the-shelf)  and  MOTS  (modified  off-the-shelf),  then  you’ve  lost  the  bubble  and  should  seek 
professional  help.  This  isn’t  Burger  King.  If  you  go  with  COTS,  you  can’t  have  it  your  way. 

This  month’s  articles  illustrate  both  the  drawbacks  and  the  benefits  of  using  COTS  software 
in  the  development  and  sustainment  of  Department  of  Defense  weapon  systems.  In  Added 
Sources of Costs in Maintaining COTS-Intensive Systems, Dr. Betsy Clark and Dr. Brad Clark capture
the many frustrations expressed by project managers and team leads in communicating to upper
management the reasons why COTS-based systems are so expensive to maintain. Dr. David A.
Cook  addresses  Issues  to  Consider  Before  Acquiring  COTS,  in  particular  the  problems  of  trying  to 
integrate  multiple  COTS  applications.  Essential  to  success  is  the  need  to  apply  basic  software 
engineering  principles  and  beware  of  marketing  hype. 

At  the  402  Software  Maintenance  Group  at  Robins  AFB,  we  continue  to  see  benefits  from 
using COTS software, particularly in our integration environments, if used wisely. In Lean AISF:
Applying COTS to System Integration Facilities, Harold Lowery emphasizes the importance of weighing
all alternatives and clearly defining trade-offs when considering make-versus-buy decisions.
Additionally, in GL Studio Brings Realism to Aircraft Cockpit Simulator Displays, Kim Stults demonstrates
that with sufficient pre-planning, COTS products can be a viable means of upgrading
systems, saving both time and budget. We conclude the theme articles with a discussion of using
Java with real-time systems in Applying COTS Java Benefits to Mission-Critical Real-Time Software by
Dr.  Kelvin  Nilsen. 

The  debate  on  the  benefits  and  challenges  of  COTS  is  not  likely  to  be  settled  soon.  Most 
things  in  life  have  their  benefits  and  drawbacks;  as  with  life,  we  must  keep  an  open  mind  when 
considering  COTS. 


Diane  E.  Suchan 

Warner Robins Air Logistics Center Co-Sponsor
Added Sources of Costs in
Maintaining  COTS-Intensive  Systems 


Dr.  Betsy  Clark  and  Dr.  Brad  Clark 
Software  Metrics  Inc. 

Ten  years  ago,  work  was  begun  at  the  Center for  Systems  and  Software  Engineering  at  the  University  of  Southern  California 
to develop a cost model for commercial off-the-shelf (COTS)-based software systems¹. A series of interviews was conducted
to collect data to calibrate this model². A total of 25 project managers were interviewed; for eight of these projects, data was
collected during the original system development and maintenance phases. A common sentiment heard from the people maintaining
these systems was that they turned out to be more expensive to maintain than originally envisioned and, in fact, were
more costly than a comparable custom-built system. At the same time, several people expressed frustration about the difficulty
of communicating to upper management the reasons why COTS-based systems were so expensive to maintain. Anecdotal
evidence from these interviews is used to discuss the added sources of maintenance cost. Three different approaches or strategies
for system maintenance were observed and are summarized in this article.


The  past  10  to  15  years  have  seen  a 
strong  push  within  the  Department  of 
Defense  and  other  government  agencies 
toward  the  use  of  COTS  software  prod¬ 
ucts  in  system  acquisition.  On  the  surface, 
this  makes  a  lot  of  sense  -  why  build 
something  from  scratch  that  already  exists, 
especially  if  it  is  a  mature  product?  In  fact, 
with  the  increasing  complexity  of  today’s 
systems,  a  total  custom  system  is  no 
longer  practical.  With  the  continued  use  of 
COTS  components  in  building  systems,  it 
is  a  worthwhile  objective  to  identify 
sources  of  cost  and  approaches  to  manag¬ 
ing  them.  It  is  the  intent  of  this  article  to 
identify  these  sources. 

As  part  of  an  effort  to  collect  data  to 
calibrate  a  cost  model  for  systems  consist¬ 
ing  of  COTS  components,  a  number  of 
interviews  were  conducted  with  project 
managers,  team  leads,  and  other  project 
members  maintaining  COTS-intensive 
systems.  People  consistently  told  us  that 
these  systems  were  more  expensive  to 
maintain  than  originally  estimated  and,  in 
fact,  were  more  costly  than  a  comparable 
custom-built  system  to  maintain.  At  the 
same  time,  we  heard  frustration  expressed 
about  the  difficulty  of  communicating  to 
upper  management  the  reasons  why 
COTS-based  systems  are  so  expensive  to 
maintain.  If  COTS-based  systems  really 
are  more  costly  to  maintain,  what  are  these 
additional  costs?  Are  there  strategies  for 
managing  or  minimizing  them?  These 
questions are addressed in this article³.
Before  proceeding  to  answer  these 
questions,  we  need  to  define  some  terms. 
A  COTS  software  component  is  defined 
as  it  was  used  in  the  COCOTS  model  [2]. 
This  definition  contains  the  following  four 
parts: 

1. A COTS component is sold, leased, or
licensed for a fee (which includes vendor
support in fixing defects if they
are found).

2.  The  source  code  is  unavailable. 

3.  The  component  evolves  over  time  as 
the  vendor  provides  periodic  releases 
of  the  product  (upgrades)  containing 
fixes  and  new  or  enhanced  functional¬ 
ity. 

4. Any given version of a COTS component
will reach eventual obsolescence
or end of life in which it will no longer
be supported by the vendor.

All four parts of this definition have
major  implications  for  the  added  costs  of 
maintaining  COTS-intensive  systems 
(compared  to  comparable  custom-devel¬ 
oped  systems). 

At  any  given  time  for  any  given  com¬ 
ponent,  there  is  a  choice  between  upgrad¬ 
ing  to  the  next  version  or  doing  nothing. 
There  are  risks  inherent  in  either  strategy. 
If the first choice is made to upgrade to a
new version, there may be unintended
interactions with other components, and
defects or unneeded functionality may be
introduced. These types of
impacts  are  discussed  in  more  detail  in  this 
article. 

If  the  second  choice  is  made  to  do 
nothing,  the  component  will  eventually 
reach end-of-life and will no longer be
supported  by  the  vendor.  If  a  problem 
with  the  component  surfaces  at  this  point 
in  time,  the  vendor  will  not  fix  it  and  the 
system  maintenance  team  cannot  do  much 
because  they  do  not  have  access  to  the 
source  code. 

Also,  before  proceeding,  it  is  impor¬ 
tant  to  clarify  the  type  of  projects  the 
interviewed  managers  were  involved  in. 
The  Software  Engineering  Institute  (SEI) 
has  made  a  distinction  between  COTS- 
solution  and  COTS-intensive  systems  [3]. 
COTS-solution systems are the typical
business or standard information technology
systems that are comprised of large
application  COTS  products.  Examples 
include  Enterprise  Resource  Planning 
applications,  human  resource,  and  finan¬ 
cial  systems.  The  major  COTS  component 
is  essentially  the  system.  It  provides  a  user 
interface,  has  its  own  architecture,  and  has 
internal  business  logic  that  must  be  fol¬ 
lowed  to  be  used.  On  the  other  hand, 
COTS-intensive  systems  are  comprised  of 
many  COTS  components.  In  these  sys¬ 
tems,  no  single  component  is  king.  There 
may  be  many  components  that  handle 
user  interface,  data  transmission  and  stor¬ 
age,  and  data  manipulation  and  transfor¬ 
mation.  These  components  interact  with 
each  other  through  custom-developed  glue 
code  using  vendor-provided  application 
program  interfaces  and  with  custom- 
developed  application  code.  The  business 
logic  is  spread  across  components  and  is 
guided  by  the  way  the  components  are 
used. 

The  systems  that  predominantly  made 
up  our  sample  are  mission-critical  systems 
with  high  reliability  and  performance 
requirements  and  would  be  classified  as 
COTS-intensive.  A  number  of  our  pro¬ 
jects  were  air-traffic  control  systems;  we 
also  had  ground  control  systems  for  mis¬ 
sile  launches  and  two  ground  control  sys¬ 
tems  for  satellites.  In  addition  to  the  high 
performance  and  reliability  requirements, 
these  systems  typically  had  a  large  amount 
of  custom  application  code  along  with  a 
large  number  of  COTS  products 
(between  10  and  50  was  typical). 

Two  additional  points  are  worth  bring¬ 
ing  up  before  discussing  the  specific 
sources  of  added  costs  for  these  COTS- 
intensive  systems.  We  deliberately  chose 
the  term  maintenance  rather  than  the  more 
commonly used term sustainment because,
like hardware, a COTS-intensive system
will, in effect, degrade without dollars and
effort  spent  to  manage  the  impact  of  mul¬ 
tiple  components  evolving  over  time.  This 
is  the  central  thesis  of  this  article  and  is 
the  source  of  additional  cost  for  these  sys¬ 
tems. 

We  do  not  want  to  leave  the  impres¬ 
sion  that  we  are  against  the  use  of  COTS 
components.  Given  the  complexity  of 
many  of  today’s  systems,  total  custom 
development  is  no  longer  feasible.  In  addi¬ 
tion,  the  use  of  COTS  components  allows 
system  developers  to  take  advantage  of 
the  best  that  the  marketplace  has  to  offer 
and  removes  the  unnecessary  reinvention  of 
the  wheel  seen  prior  to  the  widespread  use 
of  COTS  components.  Our  objective  is  to 
help  people  anticipate  and  manage  the 
added  sources  of  costs  in  maintaining 
COTS-intensive  systems. 

Major  Sources  of  Added  Costs 
in  Maintaining  Systems  With 
COTS  Software  Components 

This  section  discusses  the  factors  that 
were  found  to  impact  the  cost  of  main¬ 
taining  COTS-intensive  systems.  Each  of 
the  following  factors  is  compared  to  cus¬ 
tom  developed  systems. 

Licensing 

The  most  obvious  additional  cost  burden 
is  component  licensing  fees.  Fees  can 
range  from  a  one-time  fee  to  yearly  renew¬ 
al.  The  license  may  be  enterprise-wide, 
site-specific,  or  per  seat  (one  computer). 
With  one  exception,  licensing  fees  did  not 
cause  concern  among  the  project  members 
interviewed,  presumably  because  this  was 
an  expected,  known  life-cycle  cost.  The 
one exception occurred for a COTS-solution
system that was used on a pilot basis
at  one  location.  Following  a  successful 
pilot,  the  decision  was  made  to  deploy  the 
system  worldwide  which  would  entail  hun¬ 
dreds  of  sites.  Much  to  the  surprise  of  the 
project  manager,  the  per  site  fee  was 
increased  by  the  vendor.  At  the  time  of  the 
interview, he indicated that he had assumed
that they would get a quantity discount;
instead, the price per copy was actually going up.
The  increase  in  price  was  so  great  that  he 
was  seriously  considering  starting  over, 
this  time  writing  the  system  themselves 
from  scratch.  There  are  no  comparable 
fees  for  custom-developed  systems. 

There  is  effort  required  in  tracking 
licensing  requirements  to  ensure  that 
renewals  are  paid.  With  different  types  of 
licensing  and  support  agreements  across 
different  COTS  components  and  vendors, 
this tracking can become an administrative
burden. There is no comparable effort
required for custom systems.
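
As a rough illustration of the tracking effort described above, the following Python sketch flags licenses that come due within a planning window. It is only a sketch: the `License` record, its fields, the component names, and the 90-day window are assumptions made for this example and do not come from the projects interviewed.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class License:
    component: str      # hypothetical COTS component name
    scope: str          # "enterprise", "site", or "seat"
    renewal_due: date   # next date a renewal fee must be paid


def renewals_due(licenses, today, horizon_days=90):
    """Return licenses due on or before today + horizon_days, soonest first.

    Licenses whose renewal date has already passed are included as well,
    since a lapsed license is the case the maintainer most needs to see.
    """
    cutoff = today + timedelta(days=horizon_days)
    due = [lic for lic in licenses if lic.renewal_due <= cutoff]
    return sorted(due, key=lambda lic: lic.renewal_due)


# Illustrative inventory; names and dates are made up.
inventory = [
    License("database", "enterprise", date(2008, 9, 1)),
    License("map-display", "seat", date(2008, 7, 15)),
    License("message-bus", "site", date(2009, 3, 1)),
]

for lic in renewals_due(inventory, today=date(2008, 6, 1)):
    print(f"{lic.renewal_due}  {lic.component} ({lic.scope})")
```

With tens of components, each on its own renewal cycle and license type, even a simple report like this has to be kept current by someone, which is the administrative cost the interviewees described.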

Evaluation  of  New  Releases 

A  major  source  of  cost  stems  from  COTS 
component  volatility. 

Volatility  in  this  case  means  the  fre¬ 
quency  with  which  vendors  release 
new  versions  of  their  products  and 
the  significance  of  the  changes  in 
those new versions, i.e., minor upgrades
versus major new releases. [4]

In  contrast  to  custom-developed  code, 
a  COTS  software  component  is  controlled 
by  the  vendor.  The  timing  and  content  of 
releases  is  at  the  discretion  of  the  vendor. 
Major  effort  may  be  required  to  evaluate 
and  understand  the  implications  of 
upgrading  to  a  new  component  or  per¬ 
haps  switching  to  a  whole  new  product 
entirely. 

COTS  software  component  evaluation 
addresses  the  following  questions: 

1.  Are  there  interactions  with  other  parts 
of  the  system? 

2.  Are  there  any  performance  impacts  on 
the  system  as  a  whole? 

3. Will we need to rewrite glue code or
application code?

4.  Are  there  any  impacts  on  any  custom 
code? 

5.  Are  there  new  features  that  need  to  be 
disabled? 

6.  If  there  are  multiple  hardware  config¬ 
urations  in  the  field,  can  we  be  sure 
there  are  no  unintended  interactions 
for  any  of  these? 

7.  Should  we  continually  upgrade  as  new 
versions  appear  or  should  we  only 
upgrade  for  a  critical  fix? 

Evaluation activities require a test bed
that can replicate all deployed system configurations
of hardware and software. For
safety-critical systems, the amount of
analysis can be large even though the ultimate
decision may be to do nothing. As
one person stated, "Even when we don’t change
a version, there is a lot of analysis required. It
can be difficult to verify implications with a black
box." The need for this ongoing black-box
evaluation is unique to systems with
COTS components.

Defect  Hunting 

Defects  appear  to  be  more  problematic 
for  COTS-intensive  systems  than  with 
custom  code.  After  documenting  and  con¬ 
firming  the  existence  of  a  defect,  the  next 
step  is  finding  the  source  within  the  sys¬ 
tem.  Projects  reported  that  it  can  be  much 
more  difficult  with  a  COTS-based  system 
to  pinpoint  the  source  of  a  problem.  It 
can  be  difficult  to  know  whether  a  defect 
is  coming  from  a  COTS  component  or 
from  other  custom  developed  code.  We 
heard  of  finger-pointing  situations  in 
which  a  defect  was  in  a  COTS  product, 
but  the  vendor  was  unable  to  replicate  it 
because  they  did  not  have  the  same  hard¬ 
ware  configuration.  (Incidentally,  this  is  a 
problem  that  may  occur  during  develop¬ 
ment  as  well  as  at  any  point  in  the  life 
cycle.) All of this can take time and effort,
translating  into  additional  costs. 

With  a  custom  system,  one  can  see 
inside  the  box.  Debugging  can  follow  the 
path  through  the  code  without  running 
into  component  boundaries.  This  elimi¬ 
nates  finger  pointing. 

Vendor  Support 

Vendor  support  is  often  used  in  mainte¬ 
nance  to  fix  defects  quickly,  provide  assis¬ 
tance  with  the  latest  product  upgrades,  or 
make  adjustments  to  the  COTS  compo¬ 
nent  in  the  presence  of  other  product 
upgrades.  The  support  may  range  from 
24/7  call  service  to  dedicated  on-site 
staffing.  If  a  defect  is  found  and  it  looks 
like  the  source  is  a  COTS  component,  it  is 
the  vendor  who  must  fix  the  problem 
(provided  the  vendor  agrees  their  product 
has  the  problem  -  as  noted  above,  this  res¬ 
olution  can  take  a  lot  of  time  and  effort). 
A  variety  of  contractual  mechanisms  can 
be  in  place  to  guarantee  24/7  support  and 
immediate  fixes  (this,  too,  can  be  a  quag¬ 
mire  if  the  support  is  unsatisfactory).  If 
the  latest  release  of  a  COTS  software 
component  has  new  features  or  interfaces, 
a  vendor’s  support  may  be  required  to 
integrate  a  component  into  the  current 
system.  This  support  may  include  some 
tailoring  by  the  vendor  to  get  their  com¬ 
ponent  to  cooperate  with  the  existing  sys¬ 
tem  architecture.  Finally,  if  a  vendor  has 
gone  out  of  business,  support  may  be 
unavailable.  Risk-mitigation  strategies  are 
discussed  later  that  address  this  situation. 

Upgrade  Ripple  Effect 

After  a  new  version  of  a  COTS  compo¬ 
nent  has  been  evaluated,  the  installation  of 
the component into the system may have
a  ripple  effect.  Due  to  the  new,  additional 
functionality  in  a  component,  the  system 
may  require  changes  to  custom  code,  glue 
code  between  components,  or  tailoring  of 
other  COTS  components.  In  custom- 
developed  code  maintenance,  only  the 
fixes  and  enhancements  that  are  needed 
are  implemented,  thus  minimizing  (but 
not eliminating) ripple effects.

Hardware  Upgrades 

People  found  that  upgrades  to  new  soft¬ 
ware  components  sometimes  required 
upgrades  to  new  hardware  as  well.  One 
person noted that vendors were constantly
driven to add functionality, putting
more demands on the hardware. The project
has not been able to upgrade the hardware
as quickly as it would like.

In  a  comparable  custom  maintenance 
upgrade,  hardware  performance  is  consid¬ 
ered  as  part  of  the  upgrade  activity.  With 
only  the  required  features  implemented, 
minimal  impact  to  hardware  performance 
can  be  preserved. 

Disabling  New  Features 

There  may  be  new  features  that  need  to  be 
disabled  for  security  or  performance  rea¬ 
sons.  The  added  cost  is  in  the  form  of 
additional tailoring of the COTS component.
This may require discovering how to
disable new features or writing custom code
to hide or disable them.
Disabling a feature is not characteristic of
custom systems.

Early  Maintenance 

Because  COTS  components  continue  to 
evolve  in  the  marketplace,  it  is  possible 
that  upgrades  may  begin  before  the  system 
is  deployed,  particularly  if  the  develop¬ 
ment  spans  several  years.  If  the  compo¬ 
nents  are  not  upgraded,  it  is  possible  that 
much  of  the  system  may  have  reached  end 
of  life  before  the  system  is  even  delivered. 
This  was  the  case  according  to  one  of  the 
project  managers  interviewed;  this  system 
had  an  application  base  totaling  more  than 
one  million  lines  of  custom  code  plus  a 
total  of  45  COTS  components.  Almost 
half  of  these  components  were  obsolete 
by  the  time  the  system  was  deployed. 

Market  Watch 

Because  COTS  vendors  can  go  out  of 
business,  a  number  of  those  interviewed 
suggested  that  a  market  watch  be  estab¬ 
lished  as  a  risk  mitigation  strategy  to  han¬ 
dle  such  an  event.  If  a  vendor  goes  out  of 
business,  either  the  component  source 
code  or  a  different  component  can  be  pur¬ 
chased.  With  custom-developed  systems, 
this  activity  is  not  required. 

Continuous  Funding 

Another difference between a COTS-based
and a custom system is that the systems
with COTS components require a
more  stable  funding  base.  When  budgets 
get  tight,  funding  for  maintenance  is  often 
sacrificed.  With  a  custom  system, 
enhancement  can  be  delayed  until  funding 
is obtained. The consequences of delaying
funding with a COTS-based system are that
licenses  may  lapse,  bug  fixes  and  upgrades 
become  unavailable,  or  vendors  go  out  of 
business  with  no  resources  to  exercise  the 
risk  mitigation  identified  in  a  market 
watch. 

Number  of  Components  Versus 
Maintenance  Costs 

One  consistent  comment  we  heard  is  that 
the  number  of  COTS  components  in  a 
system  has  a  strong  impact  on  mainte¬ 
nance  costs.  A  model  adapted  from  one 
proposed  by  Chris  Abts,  called  COTS-Life 
Span  Model  (LIMO)  [5],  attempts  to 
explain  this  phenomenon.  The  model, 
depicted  in  Figure  1,  shows  two  regions 
divided into maintenance economies
(over time, costs go down) and maintenance
diseconomies (over time, costs go
up). As explained in COTS-LIMO, maintenance
costs for a single COTS component
go down over time as the experience
gained  by  system  maintainers  increases, 
thus  improving  productivity.  The  increase 
in  productivity  can  outpace  the  increased 
effort  required  to  maintain  the  system  as 
the  COTS  products  mature  and  evolve  in 
divergent  directions.  However,  there  is  a 
break-even  point  with  the  number  of 
installed  COTS  software  components,  (n), 
where  maintenance  costs  increase  dispro¬ 
portionately  to  the  number  of  COTS 
products,  regardless  of  the  efficiencies 
gained. 

Factors  that  contribute  to  the  COTS 
maintenance  diseconomies  in  Figure  1  are 
those  that  are  discussed  in  this  article.  For 
instance, the issues raised with COTS licensing
are much more complex with more
components.  A  COTS-intensive  system 
presents  multiple  licensing  strategies,  dif¬ 
ferent  renewal  periods,  and  different 
license  cost  structures.  People  reported 
that  this  can  become  an  administrative 
nightmare. 

Evaluating  the  impact  of  upgrades  is 
considerably  more  burdensome  if  there 
are  a  lot  of  components  (greater  than  n). 
The  number  of  possible  interactions 
between  components  increases  exponen¬ 
tially  as  the  number  of  components 
increases. When trying to hunt down
defects, the complex interactions of many
components make the task even more difficult.
Configuration management becomes
more complex when many components
and configurations exist in a system.
The possibility of a ripple effect from
component upgrades is higher.
There are more unwanted features with
more components. The market watch
becomes a large-scale activity.

Figure 1: COTS Maintenance Economies Versus Diseconomies

The  idea  of  this  model  was  verified 
when  we  kept  hearing  that  the  complexity 
of  maintaining  a  COTS-based  system 
increases  dramatically  as  the  number  of 
different  COTS  components  increases. 
There  is  much  more  potential  for  interac¬ 
tion  as  well  as  more  potential  upgrades 
that  have  to  be  examined. 
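
To give a feel for how quickly the potential for interaction grows, the short Python sketch below counts two things for a system of n components: the distinct pairs of components, and the groups of two or more components that could interact. Both counts are illustrative assumptions about what counts as an interaction and are not part of COTS-LIMO; the pair count grows quadratically, while the count of larger groups grows exponentially. The sample sizes follow the range of 10 to 50 components reported as typical.

```python
from math import comb  # available in Python 3.8 and later


def pairwise_interactions(n):
    """Number of distinct component pairs: n choose 2."""
    return comb(n, 2)


def interacting_groups(n):
    """Number of subsets containing two or more components: 2**n - n - 1."""
    return 2**n - n - 1


for n in (10, 25, 50):
    print(n, pairwise_interactions(n), interacting_groups(n))
```

For 10 components this gives 45 pairs and 1,013 groups; for 50 components there are already 1,225 pairs to consider, before any larger combinations are examined.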

Three  Risk-Mitigation 
Strategies  to  Deal  With  the 
Challenges  of  Maintaining 
COTS-Intensive  Systems 

This  article  has  discussed  sources  of  addi¬ 
tional  cost  in  maintaining  COTS-intensive 
systems.  Across  the  projects  interviewed, 
three  strategies  for  dealing  with  COTS 
volatility  were  observed.  Each  of  these 
strategies is discussed next.

Revert  Back  to  Source  Code 

Several  of  the  projects  interviewed  opted 
to  maintain  one  or  more  critical  COTS 
components  themselves.  In  one  case,  the 
product  (an  operating  system)  was  allowed 
to  reach  end  of  life  and  the  project  pur¬ 
chased  the  source  code  from  the  vendor. 
From  that  point  on,  they  no  longer  had 
vendor  support  but  were  able  to  make 
fixes  themselves.  This  decision  was  made 
because  it  avoided  the  necessity  for  hard¬ 
ware  upgrades.  It  removed  the  risk  of 
being  unable  to  fix  future  problems. 
Alternatively,  several  other  projects 
replaced  critical  COTS  components  with 
their  own  custom-developed  software. 

This  strategy  places  control  for  fixing 
problems  back  in  the  hands  of  the  main¬ 
tenance  organization.  A  downside  is  the 
additional  expense  (purchasing  the  source 
code  or  developing  it  from  scratch).  In  the 
case  given  dealing  with  the  operating  sys¬ 
tem,  this  strategy  was  part  of  a  larger 
strategy  to  freeze  the  hardware  configura¬ 
tion,  much  of  which  was  special  purpose, 
for  a  period  of  time  until  the  next  genera¬ 
tion  system  could  be  deployed. 

Divide  and  Conquer 

This  strategy  divides  the  COTS  software 
components into two categories: non-critical
and critical. The non-critical COTS
components  are  not  upgraded.  Resources 
are  focused  on  the  set  of  critical  compo¬ 
nents.  For  these  components,  market 
watch  and  evaluation  activities  occurred 
and the decision to upgrade was made
individually for each critical component.

This  strategy  is  driven  by  the  need  to 
balance  the  ongoing  costs  required  for 
maintaining  a  COTS-intensive  system 
with  limited  resources.  The  upside  of  this 
strategy  is  that  it  saves  money  by  ignoring 
a  subset  of  components.  The  downside  of 
this  strategy  is  that  a  portion  of  the  sys¬ 
tem  remains  stagnant  and  unsupported. 

Design  for  Change 

The  third  strategy  uses  information  hiding  in 
the  form  of  wrappers  to  protect  the  sys¬ 
tem  from  unintended  negative  impacts  of 
multiple  component  upgrades.  One  inter¬ 
viewee  said  when  describing  this  strategy 
that  they  wanted  to  be  able  to  replace  a 
product  without  damage  to  the  rest  of  the 
system. As an example, they had a wrapper
around the database. It could be a flat
file or relational database - the custom
application didn’t care. This strategy
requires  more  thought  and  effort  up  front. 
The  project  in  our  sample  that  used  this 
strategy  had  a  strong  project  sponsor  right 
from  the  beginning  who  argued  success¬ 
fully  for  additional  resources  to  design  for 
change  from  the  beginning.  This  was  a 
project  that  was  planned  for  a  long  life 
with  safety-critical  requirements. 
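
The wrapper idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the interviewed project's design: the `RecordStore` interface, the two implementations, and the `record_status` function are hypothetical names, a JSON file stands in for the flat file, and SQLite stands in for a relational COTS database.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class RecordStore(ABC):
    """Hypothetical wrapper interface; application code depends only on this."""

    @abstractmethod
    def put(self, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, key: str): ...


class FlatFileStore(RecordStore):
    """Keeps all records in a single JSON file."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        try:
            with open(self.path, encoding="utf-8") as f:
                return json.load(f)
        except FileNotFoundError:
            return {}  # no file yet means no records

    def put(self, key, value):
        data = self._load()
        data[key] = value
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(data, f)

    def get(self, key):
        return self._load().get(key)


class RelationalStore(RecordStore):
    """Keeps records in a SQLite table (a stand-in for a relational product)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS records (key TEXT PRIMARY KEY, value TEXT)"
        )

    def put(self, key, value):
        with self.conn:  # commits the transaction on success
            self.conn.execute(
                "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key):
        row = self.conn.execute(
            "SELECT value FROM records WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def record_status(store: RecordStore, unit: str, status: str) -> str:
    """Application code written against the wrapper, not against a product."""
    store.put(unit, status)
    return store.get(unit)


print(record_status(RelationalStore(), "radar-1", "ok"))
```

Because `record_status` sees only the `RecordStore` interface, replacing the product behind it means writing one new implementation class rather than touching the application code. That is the up-front design cost being traded for lower upgrade and replacement cost later.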

The  advantages  of  this  strategy  are 
clear:  There  is  much  more  assurance 
against  unintended  ripple  effects  from 
upgrades  or  even  product  replacement 
with  a  product  from  another  vendor.  The 
disadvantage  is  the  necessity  for  resources 
early in system development when the typical
focus is on getting the system
deployed rather than worrying a great deal
about  the  life-cycle  consequences  of  deci¬ 
sions. 

Concluding  Remarks 

This  article  has  discussed  the  sources  of 
additional  costs  required  to  maintain 
COTS-intensive  systems.  As  noted  in  the 
introduction,  we  do  not  want  to  leave  the 
impression  that  we  are  against  the  use  of 
COTS  components.  One  of  the  people 
interviewed  expressed  the  view  that  the 
continual  evolution  and  maturation  of 
COTS  components  is,  in  fact,  one  of  the 
real  positives  of  using  commercial  com¬ 
ponents  in  a  system. 

It  is  the  authors’  objective  to  help  peo¬ 
ple  understand  some  of  the  added  sources 
of  costs  in  maintaining  a  COTS-intensive 
system,  particularly  by  bringing  attention 
to  areas  that  may  not  have  been  anticipat¬ 
ed.  In  particular,  projects  should  under¬ 
stand  the  life-cycle  implications  of  inte¬ 
grating  a  large  number  of  COTS  compo¬ 
nents.  More  thought  should  be  given  early 
to  the  impact  of  upgrades  on  the  entire 
system,  the  reliance  on  vendors  to  fix 
problems,  and  the  strategies  that  will  be 
used  in  dealing  with  multiple  products, 
each  evolving  at  the  discretion  of  the  ven¬ 
dor.  These  concerns  become  especially 
problematic  with  high  assurance,  high  per¬ 
formance  systems. 

The  sources  of  added  costs  discussed 
in  this  article  were  identified  through 
anecdotal  evidence  obtained  from  inter¬ 
views.  The  next  step  is  to  quantify  these 
sources  and  the  parameters  that  impact 
each.  We  are  looking  for  opportunities  to 
continue  this  investigation. ♦ 
Notes

1. This model, named Constructive
COTS Model (COCOTS), is one of
the COCOMO family of models.

2.  The  interviews  and  model  calibration 
were  sponsored  by  the  Federal 
Aviation  Administration’s  (FAA)  Soft¬ 
ware  Engineering  Resource  Center. 
The  interviews  were  conducted  by  Dr. 
Chris  Abts  and  Dr.  Betsy  Clark. 

3.  For  a  discussion  of  lessons  learned  in 
maintaining  COTS-intensive  systems, 
see  Reifer,  et  al.  [1]. 
Issues to Consider Before Acquiring COTS


Dr.  David  A.  Cook 
The  AEgis  Technologies  Group,  Inc. 

In  today’s  software  and  acquisition  environment,  the  decision  to  use  commercial  off-the-shelf  (COTS)  and  government  off- 
the-shelf  (GOTS)  is  not  only  necessary  from  a  schedule  and  cost  point  of  view,  but  often  is  mandated.  However,  there  are 
many factors that influence the decision to use and choose COTS/GOTS software. Readers who have a background in
computer  science  or  who  have  taken  some  formal  software  engineering  code  development  classes  are  typically  familiar  with 
the  effects,  but  many  engineers  who  acquire  COTS  do  not  have  this  background.  This  article  discusses  some  basic  but  often- 
neglected  factors  affecting  COTS  selection  and  use. 


Back  in  the  early  days  of  computing, 
COTS  use  was  commonplace. 
However,  instead  of  integrating  with 
other  programs,  the  COTS  packages  ran 
independently.  It  was  commonplace  to 
lease  or  buy  accounting  or  other  similar 
software  from  your  computer  vendor. 
Locally  written  software  seldom,  if  ever, 
interfaced  with  COTS.  Programs  tended 
to  be  relatively  small,  and  interaction 
between  programs  consisted  primarily 
of  the  output  of  one  program  being  fed 
in  as  input  to  other  programs. 

As  computing  power  increased,  and 
computer  use  proliferated,  more  and 
more  programs  interacted  with  other 
programs,  and  all  interacting  programs 
ran  concurrently.  As  this  occurred,  soft¬ 
ware  manufacturers  filled  the  need  for 
common  programs  by  devising  pro¬ 
grams  that  were  not  made  to  run  inde¬ 
pendently,  but  instead  provided  partial 
solutions  to  the  overall  problem.  The 
intent  was  that  you  could  purchase  a 
product  that  would  meet  one  or  more  of 
your  requirements  and  simply  plug  in 
this  product. 

COTS  was,  at  one  time,  envisioned 
as  a  massive  time  and  money  saver.  You 
would  go  shopping  at  a  software  ware¬ 
house  and  select  COTS  systems  as 
needed.  It  would  seamlessly  interact 
with  your  organic,  homegrown  program 
and  also  with  any  and  all  other  COTS 
products  that  you  selected.  In  a  sense, 
COTS  would  operate  like  the  ubiquitous 
Universal  Serial  Bus  (USB)  plug  and  play 
devices  that  have  become  so  common. 

Unfortunately,  this  promise  of  plug 
and  play  software  was  not  as  effortless  as 
plug  and  play  was  for  hardware.  In  hard¬ 
ware,  there  are  very  specific  standards 
that  specify  how  the  interface  to  the 
computer  must  occur.  This  allows  a 
hardware  creator  to  design  a  product 
that  is  specific  in  its  purpose  (for  exam¬ 
ple,  a  256  Meg  flash  drive)  but  general  in 
its interface (so that when inserted,
drivers automatically load with any disk
insertion  or  user  interaction).  In  hard¬ 
ware,  there  are  general  device  drivers 
that  have  extremely  specific  interface 
standards.  Plug  in  a  USB  drive  from 
almost  any  manufacturer  and  the  device 
will  work  without  any  specific  tailoring  of 
the  device  to  the  hardware. 

COTS,  on  the  other  hand,  typically 
do  not  have  specific  standards.  COTS 
requires  the  user  to  adapt  his/her  envi¬ 
ronment  to  use  the  COTS.  Whereas  the 
plug-and-play  hardware  has  a  known  and 
generalized standard that all users are
willing  to  adapt  to,  COTS  requires  each 
user  to  adapt  their  individual  software 
to  interface  with  the  COTS.  Each  user 
typically  has  specific  and  unique  needs, 
yet  they  want  to  interact  with  general- 
purpose  COTS.  This  usually  causes 


problems  in  that  the  COTS  does  not 
meet  all  of  the  user  needs,  and  in  fact 
might  contain  functionality  that  the  user 
does  not  want.  Explaining  how  COTS 
interacts  with  other  applications  can  be 
better  understood  by  examining  two  rel¬ 
atively  common  software  engineering 
concepts:  coupling  and  cohesion. 
Coupling  and  cohesion,  which  are  rela¬ 
tively  basic  terms  in  computer  science, 
can  be  applied  to  COTS  to  help  deter¬ 
mine  COTS  quality  and  also  to  deter¬ 
mine  how  easy  it  will  be  to  integrate 
COTS  with  locally  developed  code. 

Coupling 

Coupling refers to how programs interact.
There is a range of ways in which multiple
modules can interact (see Table 1 for the
different types of coupling). In the best
scenario, modules each fulfill a specific
purpose and no direct interaction is
required. This lack of coupling is ideal
in that the actions of one program
or module have little or no effect on
other programs. However, this type of
scenario is reminiscent of olden days
when programs ran sequentially, rather
than in parallel. Today, COTS almost
always requires interactions with other
concurrently running programs.

Table 1: Types of Coupling, Ranked From Best to Worst

Message Coupling (low coupling, and the best type)
    This is the loosest type of coupling. Modules are not
    dependent on each other; instead, they both use a public
    interface to pass messages between them, such as an
    object-oriented message.

Data Coupling
    Data coupling is when modules share data through, for
    example, parameters. Each datum is an elementary piece,
    and these are the only data that are shared (e.g. passing an
    integer to a function that computes a square root).

Stamp Coupling (data-structured coupling)
    Stamp coupling is when modules share a composite data
    structure and use only a part of it, possibly a different part
    (e.g. passing a whole record to a function which only needs
    one field of it). This may lead to changing the way a module
    reads a record because a field that the module does not
    need has been modified.

Control Coupling
    Control coupling is one module controlling the logic of
    another by passing it information on what to do (e.g. passing
    a what-to-do flag).

External Coupling
    External coupling occurs when two modules share an
    externally imposed data format, communication protocol, or
    device interface (e.g. you have no control over the interface).

Common Coupling
    Common coupling is when two modules share the same
    global data (e.g. a global variable). Changing the shared
    resource implies changing all the modules using it.

Content Coupling (highest coupling, and the worst type)
    Content coupling is when one module modifies or relies on
    the internal workings of another module (e.g. accessing local
    data of another module).

Source: <http://en.wikipedia.org/wiki/Coupling_(computer_science)>

Prior  to  C++,  the  best  type  of  cou¬ 
pling  was  referred  to  as  data  coupling.  In 
data  coupling,  all  interaction  is  defined 
by  parameter  calls.  This  type  of  cou¬ 
pling  is  simple  and  the  least  likely  to 
cause  a  ripple  effect  —  that  is,  when  a 
change  to  the  logic  of  one  module  caus¬ 
es  undesirable  effects  to  other  modules. 
The  advent  of  commonly  used  object- 
oriented  languages  has  led  to  a  new 
(lower,  and  therefore  better)  level  of 
coupling,  referred  to  as  message  coupling. 
In  a  well-designed  system,  low  coupling 
gives  COTS  the  ability  of  plug  and  play  in 
terms  of  a  standard  interface.  With  only 
message coupling or data coupling, it should
be  relatively  easy  to  remove  one  COTS 
module  and  replace  it  with  a  similar 
COTS  module  that  has  the  same  inter¬ 
face. 
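
As a rough illustration of what "the same interface" buys you, the following Python sketch shows local code that depends only on a small public interface. Everything in it is hypothetical: the `SpellChecker` interface, the two vendor classes, and the `review` function are stand-ins invented for this example, not the API of any real COTS product.

```python
from __future__ import annotations

from typing import Protocol


class SpellChecker(Protocol):
    """The only thing local code is allowed to know about the COTS product."""

    def misspelled(self, text: str) -> list[str]: ...


class VendorAChecker:
    """Stand-in for one hypothetical COTS package."""

    def __init__(self) -> None:
        self._known = {"cots", "is", "software"}

    def misspelled(self, text: str) -> list[str]:
        return [word for word in text.lower().split() if word not in self._known]


class VendorBChecker:
    """Stand-in for a replacement package with the same public interface."""

    def __init__(self) -> None:
        # Different internals; local code never sees them.
        self._known = frozenset({"cots", "is", "software"})

    def misspelled(self, text: str) -> list[str]:
        return [word for word in text.lower().split() if word not in self._known]


def review(document: str, checker: SpellChecker) -> int:
    """Local code: coupled to the checker only through parameters and results."""
    return len(checker.misspelled(document))
```

Because `review` touches nothing but the public `misspelled` call, replacing `VendorAChecker()` with `VendorBChecker()` is a one-line change at the call site, much like unplugging one radio and plugging in another.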

Back in the 1970s, the first car I
owned (a 1973 Chevrolet Impala) had
an AM/FM radio, but I wanted to
replace it with an AM/FM/Cassette
radio.  After  disassembling  the  dash¬ 
board  (no  small  feat),  I  removed  the 
radio  to  find  out  that  it  had  six  black 
wires,  one  white  wire,  and  one  green 
wire.  The  new  radio  had  seven  black 
wires  and  a  white/green  twisted  pair.  I 
had  no  idea  how  to  cleanly  replace  one 
radio  with  the  other.  A  more  recent  car, 
bought a few years ago, had a standardized
harness plug. When I asked to
replace the standard AM/FM/Cassette/CD
radio with a six-CD radio, it took about
10 minutes: basically, just unplug the
old radio and plug in the new. This is the
kind  of  interfacing  you  want  with  your 
software:  easy  to  uninstall  the  old,  easy 
to  install  the  new.  When  your  current 
COTS  supplier  goes  out  of  business  or 
no  longer  supports  your  version,  it 
should  be  easy  to  replace. 

It is not important, of course, to
determine exactly what the level of coupling
is; it is more important to keep it
low². Coupling is directly related to
information hiding, along with data
abstraction and data design. The moral
of coupling is to deliberately design systems
that interact (or might interact) with
COTS. Use interface control documents.
Have an architectural design
document. Review data requirements
with the user, have a data dictionary and
data design document, and make the
COTS interfaces as simple and straightforward
as possible. Coupling should be
kept as low as possible.

Cohesion 

Cohesion is a measure of how closely
the parts of a COTS module work toward
a single task; in other words, it measures
the stickiness of the module.
A  good  module  contains  only  resources 
that  accomplish  single  or  similar  tasks. 
The  parts  of  the  module  are  all  closely 
interrelated  and  therefore  stick  together. 
Table  2  explains  the  types  of  cohesion. 

Many studies have shown that coincidental
cohesion (such as one routine
that would calculate tax rate and compute
Celsius to Fahrenheit) and logical cohesion
(such as open all input files) are extremely
inferior to other types of cohesion³.
Combining  many  functions  into  one 
module  makes  cohesion  low  and  con¬ 
tributes  to  high  error  rates  and 
increased  debugging,  testing,  and  inte¬ 
grating  time.  In  addition,  relatively  small 
maintenance  fixes  tend  to  have  ripple 
effects  that  require  testing  and  debug¬ 
ging  of  many  routines  that  are  not  logi¬ 
cally  related.  Good  developers  know 
that  an  application  that  has  many  rela¬ 
tively  simple  modules  is  easier  to  devel¬ 
op  and  test.  In  fact,  this  is  one  of  the 
basic  concepts  of  object-oriented  pro¬ 
gramming. 
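
To make the tax-and-temperature example concrete, here is a minimal Python sketch. The function names and the flat-rate tax calculation are simplifying assumptions made for this illustration only. The first routine shows coincidental cohesion; the two that follow split it into functionally cohesive pieces.

```python
from __future__ import annotations


def tax_and_temperature(income: float, rate: float, celsius: float) -> tuple[float, float]:
    """Coincidental cohesion: two unrelated jobs share one routine."""
    return income * rate, celsius * 9 / 5 + 32


def tax_owed(income: float, rate: float) -> float:
    """Functional cohesion: one well-defined task (flat-rate tax)."""
    return income * rate


def celsius_to_fahrenheit(celsius: float) -> float:
    """Functional cohesion: one well-defined task."""
    return celsius * 9 / 5 + 32
```

A fix to the tax logic in the combined routine forces you to retest the temperature conversion as well, even though the two are not logically related. With the split version, each function can be changed and tested on its own.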

For  most  computer  users,  the  pro¬ 
grams  we  use  have  a  single  purpose. 
Most  computer  users  have  learned  that 
it  makes  sense  to  have  a  single  word 
processor,  a  single  spreadsheet  pro¬ 
gram,  etc.  While  they  might  all  come 
from  the  same  software  developer  and 
be  designed  to  interact  together,  they 
are still separate programs. Back in the
1980s, there was a push to deliver all-in-one
software that combined lots of functionality.
It was clumsy to use, performed
poorly, and was hated by
most users. By contrast, single-purpose
applications (which have high cohesion)
with well-defined cut-and-paste and
hyperlink interfaces (low coupling) tend
to be easier to use, to be more robust,
and to include more functionality.
COTS works the same way. A single-purpose
program typically works better
and is almost always easier to integrate.

Cohesion is directly related to modularity
(whereas coupling was directly
related to information hiding). In a well-modularized
program, routines are reasonably
small and perform one action.
In a poorly modularized program, routines
grow very large, sometimes thousands
of lines. Good software engineers
know that it is easier to write, test, and
then integrate lots of small routines
rather than try to write and debug
monolithic modules of several thousand
lines.

Table 2: Types of Cohesion, Ranked From Best to Worst

Functional Cohesion (high, and the best type of cohesion)
    Functional cohesion is when parts of a module are grouped
    because they all contribute to a single, well-defined task of
    the module (e.g. calculating the sine of an angle).

Sequential Cohesion
    Sequential cohesion is when parts of a module are grouped
    because the output from one part is the input to another part
    like an assembly line (e.g. a function which reads data from
    a file and processes the data).

Communicational Cohesion
    Communicational cohesion is when parts of a module
    are grouped because they operate on the same data
    (e.g. a module which operates on one record of
    information).

Procedural Cohesion
    Procedural cohesion is when parts of a module are grouped
    because they always follow a certain sequence of execution
    (e.g. a function which checks file permissions and then
    opens the file).

Temporal Cohesion
    Temporal cohesion is when parts of a module are grouped
    by when they are processed: the parts are processed at a
    particular time in program execution (e.g. a function which is
    called after catching an exception which closes open files,
    creates an error log, and notifies the user).

Logical Cohesion
    Logical cohesion is when parts of a module are grouped
    because they logically are categorized to do the same thing,
    even if they are different by nature (e.g. grouping all
    input/output handling routines).

Coincidental Cohesion (low, and the worst type of cohesion)
    Coincidental cohesion is when parts of a module are
    grouped arbitrarily (at random); the parts have no significant
    relationship (e.g. a file of frequently used functions).

Source: <http://en.wikipedia.org/wiki/Cohesion_(computer_science)>

Whereas  it  is  important  to  keep  cou¬ 
pling  low,  cohesion  should  be  kept  high. 
To  design  for  high  cohesion,  architec¬ 
tural  and  module  design  documents  are 
needed.  Good  requirements  are  a  must, 
as  are  traceability  matrixes.  A  traceabili¬ 
ty  matrix  allows  you  to  map  the  require¬ 
ments  directly  to  the  design  and  module 
and  encourages  you  to  have  small  mod¬ 
ules  that  meet  a  small  number  of 
requirements. 
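
A traceability matrix does not need elaborate tooling. As a minimal sketch, assuming entirely hypothetical requirement IDs and module names and an arbitrary threshold of three requirements per module, a few lines of Python can flag modules that are being asked to satisfy too many requirements:

```python
from collections import defaultdict

# Hypothetical requirement IDs and module names, for illustration only.
TRACE = [
    ("REQ-001", "login"),
    ("REQ-002", "login"),
    ("REQ-003", "report"),
    ("REQ-004", "utils"),
    ("REQ-005", "utils"),
    ("REQ-006", "utils"),
    ("REQ-007", "utils"),
]


def requirements_by_module(trace):
    """Invert the requirement-to-module mapping."""
    by_module = defaultdict(list)
    for requirement, module in trace:
        by_module[module].append(requirement)
    return dict(by_module)


def overloaded_modules(trace, limit=3):
    """Return modules that satisfy more than `limit` requirements."""
    grouped = requirements_by_module(trace)
    return sorted(module for module, reqs in grouped.items() if len(reqs) > limit)
```

With the sample data, `overloaded_modules(TRACE)` singles out the `utils` module, which is a hint (not proof) that its cohesion deserves a second look.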

Again,  there  are  metrics  to  measure 
cohesion,  just  as  there  are  metrics  to 
measure  coupling.  However,  it  is  not  as 
important  to  measure  coupling  or  cohe¬ 
sion;  it  is  more  important  to  try  and 
keep  coupling  low  and  cohesion  high. 
With COTS, you want to insert off-the-shelf
components whose functionality is
both single-purpose and well defined.

COTS  Testing  and 
Requirements 

It  is  obvious  in  hindsight,  but  how  do 
you  know  that  the  COTS  products  that 
you  are  buying  will  meet  your  needs? 
Certainly  you  cannot  believe  the  mar¬ 
keting  hype,  and  because  COTS  prod¬ 
ucts  are  developed  for  many  users,  your 
application  might  have  unique  COTS 
needs  that  have  never  been  tested  for  or 
used  in  other  applications.  The  solution 
is  to  have  a  well  thought-out  test  plan 
that  makes  sure  that  your  unique  needs 
are  tested.  This  test  plan  needs  to  be 
developed  before  you  acquire  COTS. 
And,  to  develop  a  good  test  plan,  you 
need  good  requirements.  The  bottom 
line  is  that  if  you  acquire  COTS  before 
you  have  high-quality  requirements  that 
have  been  validated  by  the  user,  you  are 
likely  to  end  up  with  COTS  that  do  not 
meet  all  of  your  needs. 

Of  course,  maybe  there  are  no 
COTS  products  that  meet  all  of  your 
needs, anyway. This is a common occurrence.
Two solutions are typical:
either modify the COTS or develop
so-called glue code that holds the COTS
together.

I highly recommend against attempting
to modify COTS. Such attempts typically
lead to increased cost and lengthy
testing⁴. Instead, consider either devising
a manual work-around or implementing
a limited amount of glue code. However,
keep in mind that if the glue code is
complex and hard to maintain, it might
outweigh the cost and time savings of
using COTS.
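
To show what a limited amount of glue code can look like, here is a small Python sketch of an adapter. The `VendorInventory` class, its `lookup` method, and the part-number formats are all invented for this illustration; they stand in for a COTS component that you cannot (and should not) modify.

```python
class VendorInventory:
    """Stand-in for a COTS component we cannot modify (hypothetical API)."""

    def lookup(self, part_code: str) -> dict:
        stock = {"A-100": {"QTY": "12", "LOC": "BIN 7"}}
        return stock.get(part_code, {})


class InventoryAdapter:
    """Glue code: translates between local conventions and the vendor's."""

    def __init__(self, vendor: VendorInventory) -> None:
        self._vendor = vendor

    def quantity_on_hand(self, part_number: str) -> int:
        # Assumed local format is "a100"; the assumed vendor format is "A-100".
        code = f"{part_number[0].upper()}-{part_number[1:]}"
        record = self._vendor.lookup(code)
        # The vendor returns quantities as strings; unknown parts count as zero.
        return int(record.get("QTY", 0))
```

The adapter is the only place that knows about the vendor's formats. If it starts to grow beyond simple translation like this, that is a warning sign that the glue code is eating into the savings.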

Version,  Speed,  and  Licenses 

Other  issues  affect  the  overall  cost  of  a 
COTS  system.  Systems  with  a  lot  of 
COTS  components  have  a  problem 
with  versioning,  both  with  the  version¬ 
ing  of  the  COTS  and  with  the  underly¬ 
ing  operating  system.  When  COTS  is 
updated  by  the  vendor,  it  requires  care¬ 
ful  regression  testing  to  make  sure  that 
the  newer  versions  of  COTS  continue 
to  meet  the  exact  form,  fit,  and  function  of 
the  previous  version.  In  addition,  care 
must  be  taken  that  newer  versions,  as 
they  are  released,  do  not  add  unwanted 
functionality  (and  vulnerabilities). 
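
One simple form of regression test is to replay recorded inputs through both the old and the new release and report any disagreement. The Python sketch below assumes two hypothetical releases of a vendor rounding routine (`round_v1` and `round_v2`, both invented here); in practice these would be calls into the installed COTS versions.

```python
def regression_failures(old_version, new_version, recorded_inputs):
    """Return the inputs for which the new release disagrees with the old one."""
    failures = []
    for args in recorded_inputs:
        expected = old_version(*args)
        actual = new_version(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


# Hypothetical stand-ins for two releases of a vendor's rounding routine.
def round_v1(amount):
    return int(amount + 0.5)  # rounds halves up (for non-negative amounts)


def round_v2(amount):
    return round(amount)  # Python's round() rounds halves to the nearest even number
```

Replaying the inputs 1.2, 2.5, and 3.7 turns up a single disagreement at 2.5, where the old routine returns 3 and the new one returns 2. A subtle change of this kind is exactly what a vendor's release notes may not mention.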

A  second  issue  is  that  in  addition  to 
successive  versions  of  newer  COTS, 
each  COTS  package  might  want  specif¬ 
ic  versions  or  patches  to  the  operating 
system. Each might want a separate version
of the Java Virtual Machine installed. As
updates  to  both  the  operating  system 
and  system  libraries  are  installed  (in 
some  cases  automatically),  how  do  you 
know  that  the  COTS  will  continue  to 
work?  Again,  a  test  team  must  be  con¬ 
tinually  testing  and  checking  to  make 
sure  that  updates  to  the  operating  sys¬ 
tem  do  not  cause  unintended  side 
effects.  It  helps  if  the  COTS  has  a 
scheduled  update  cycle  and  if  pre¬ 
release  versions  of  the  upcoming  COTS 
can  be  tested  early. 

A third issue with versioning
becomes important when you have multiple
COTS packages running. As the matrix of
interrelated COTS packages is updated,
interfaces from one COTS package
might  interfere  with  other  COTS  inter¬ 
faces.  System  and  hardware  require¬ 
ments  might  conflict,  or  be  vastly  mag¬ 
nified  with  multiple  COTS  packages 
installed. 
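
Conflicting requirements of this kind can be checked mechanically before anything is installed. The following Python sketch assumes a hand-maintained table of which dependency versions each package accepts; the package names, dependency names, and version labels are all hypothetical.

```python
# Hypothetical package names and version requirements, for illustration only.
REQUIREMENTS = {
    "mapping_tool": {"java": {"1.4", "1.5"}, "os_patch": {"SP1", "SP2"}},
    "report_engine": {"java": {"1.5"}, "os_patch": {"SP2"}},
    "message_bus": {"java": {"1.4"}, "os_patch": {"SP2"}},
}


def common_versions(requirements):
    """For each dependency, the versions accepted by every package that lists it."""
    common = {}
    for accepted in requirements.values():
        for dependency, versions in accepted.items():
            common[dependency] = common.get(dependency, versions) & versions
    return common


def conflicts(requirements):
    """Dependencies for which no single version satisfies all packages."""
    shared = common_versions(requirements)
    return sorted(dep for dep, versions in shared.items() if not versions)
```

With the sample table, every package accepts operating system patch level SP2, but no single Java version satisfies all three packages, so `conflicts(REQUIREMENTS)` reports `java`. The check only helps if someone keeps the table current, which is part of what the dedicated test team is for.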

To address all of these issues, a test
plan (and a dedicated test team) is critical.
Research into the interconnection
between various COTS packages is
required on a continual basis. In the
past, the principle was typically to get it
running, then lock down the entire configuration
and baseline the system (and prevent
continual COTS and operating system
updates). However, operating systems of
today require continual updating to
counter increasing security threats. It is no
longer practical to lock down an environment.

Other  issues  with  COTS  involve  the 
performance  of  the  integrated  system. 
In  any  system,  it  is  important  to  test  (or 
at  least  prepare  for)  worst  case  conditions. 
Unfortunately,  as  requirements  change 
during  development,  worst  case  condi¬ 
tions  sometimes  change,  also.  When  you 
acquire  COTS  for  a  specific  need  and 
the  need  changes  during  development, 
performance  problems  can  occur.  The 
performance/speed  issues  can  also 
involve  hardware.  Remember  that  you 
need  to  develop  and  test  the  entire  sys¬ 
tem  for  not  only  the  average  user,  but 
also  the  worst-case  user.  Finding  out 
that  the  users  have  inadequate  hardware 
after  development  and  release  of  the 
software is bad. Remember, too, that
hardware preparation includes network
support. It would be catastrophic
to deliver a finished system only to find
out that there is not enough bandwidth
available to support the
new COTS database. Again, this is
something  that  can  be  avoided  by  hav¬ 
ing  good  requirements  and  by  having  a 
test  team  testing  against  the  require¬ 
ments. 

A  final  issue  is  licensing  costs.  For 
some  COTS,  a  separate  license  must  be 
acquired  for  each  user.  Do  you  know 
how  many  end-users  will  need  licenses? 
Is  there  an  acquisition  mechanism  in 
place  to  buy  and  distribute  the  licenses 
when  you  release  the  COTS?  Proper 
preparation  for  release  requires  that  you 
make  plans  for  licenses  and  license  dis¬ 
tribution. 

These  are  just  a  few  issues  that  affect 
the  overall  cost  of  COTS.  In  this  issue 
of Crosstalk, the article “Added
Sources of Costs in Maintaining COTS-Intensive
Systems” by Clark and Clark
explores  these  additional  costs  in 
greater  detail. 

Conclusion 

COTS  acquisition  and  integration  are 
complex  topics  —  this  article  simply 
touched  on  some  of  the  basic  issues  to 
consider.  Coupling  and  cohesion  are  rel¬ 
atively  basic  computer  science  concepts, 
but  an  understanding  of  how  they  relate 
to good software engineering can make
the  job  of  acquiring  COTS  easier.  From 
a  larger  perspective,  basic  COTS  ques¬ 
tions  can  be  answered  by  basic  soft¬ 
ware  engineering:  Gather  good  require¬ 
ments,  validate  the  requirements,  per¬ 
form  good  design,  and  have  a  test  plan. 
COTS  probably  will  not  solve  100  per¬ 
cent  of  your  requirements,  so  it  is 
important  to  know  what  you  are  willing 
to  settle  for.  Also,  understand  that 
COTS  might  save  you  time  and  money, 
but  trying  to  integrate  it  incorrectly  can 
cost  you  lots  of  money  and  will  not  nec¬ 
essarily  reduce  the  schedule  time.  Trying 
to  integrate  multiple  COTS  applications 
will  probably  magnify  your  potential 
problems;  so  learn  from  the  mistakes 
and  successes  of  others  whenever  pos¬ 
sible.  Be  wary  of  marketing  hype,  and 
remember  that  finding  another  user 
with  experience  with  your  potential 
COTS  product  is  one  of  the  best 
resources of all. ♦

Notes 

1. For a discussion of programming language
evolution and programming language
interoperability, refer to “Evolutionary
Trends of Programming Languages,”
by Thomas Schorsch and David Cook
in Crosstalk Feb. 2003 at
<www.stsc.hill.af.mil/crosstalk/2003/02/schorsch.html>.

2. See “Software Engineering: A
Practitioner’s Approach” by Roger
Pressman for a good description of
coupling, cohesion, information hiding,
abstraction, and modularity. You
can also get pointers to online sources
of information by checking out coupling
and cohesion on Wikipedia.

3. For example, see E. Yourdon and L.
Constantine. “Structured Design: Fundamentals
of a Discipline of Computer
Program and Systems Design.”
Prentice-Hall, Yourdon Press.

4. For an example, see “Business Process
Reengineering/Software Modifications”
OIG/97E-10, “Evaluation of
Best Practices for Developing and
Implementing Integrated Financial
Management System From the U.S.”
by the Nuclear Regulatory Commission,
available at
<www.nrc.gov/reading-rm/doc-collections/insp-gen/1997/97e-10.html>.
Lean AISF: Applying COTS to System Integration Facilities

Harold  Lowery 
Warner Robins Air Logistics Center

Lean Avionics Integration Support Facility (AISF) is an initiative to introduce Lean concepts and methods to the F-15
AISF.  Our  strategy  includes  the  use  of  commercial  off-the-shelf  (COTS)  and  open  source  software  where  appropriate.  In 
this  article,  the  author  briefly  describes  the  AISF  and  then  discusses  several  examples  of  using  COTS  to  reduce  maintenance 
costs  and  improve  performance. 


Modern  weapons  are  complex,  high- 
performance  systems.  Much  of  the 
performance  of  a  modern  weapon  system, 
as  well  as  its  complexity,  derives  from  the 
software  executing  on  computers  embed¬ 
ded  within  it.  It  should  come  as  no  surprise 
that  the  engineering  facilities  used  to  devel¬ 
op  and  maintain  these  weapon  systems  are 
themselves  complex  systems  that  require 
considerable  resources  to  operate  and 
maintain. The application of Lean concepts
enables significant cost reductions in
the  maintenance  of  system  integration  labs 
through  the  use  of  COTS  items  where 
appropriate.  This  article  describes  one  such 
facility,  the  AISF  located  at  Robins  Air 
Force  Base,  and  discusses  several  examples 
of  how  new  technology  impacts  it. 

Fighter  AISF 

In  order  to  discuss  Lean  AISF,  we  first 
must  discuss  the  Fighter  AISF  history. 

History 

The  Fighter  AISF  is  used  to  develop  and 
maintain  Operational  Flight  Program 
(OFP)  software,  primarily  for  the  F-15 
and  other  air  combat  platforms.  The  AISF 
achieved  initial  operation  in  the  early 
1980s  and  has  been  through  several  tech¬ 
nology  refresh  cycles  since  then.  The 
AISF includes a number of system integration
benches. These benches are closed-loop,
hardware-in-the-loop systems consisting
of avionics hardware, signal processing
hardware and software, and simulation
software. The OFPs execute in
actual  aircraft  avionics  with  the  airframe 
and  operating  environment  simulated. 
The  intent  is  that  the  OFP  software  can¬ 
not  tell  the  difference  between  flying  in  an 
aircraft  and  flying  in  the  lab. 

Principles  of  Lean 

In the early 1990s, researchers began discussing
the concept of a Lean approach to
manufacturing. Womack, Jones, and Roos
introduced the term Lean when describing
the Toyota Production System as part of a
major study of the global automotive
industry [1]. The concepts they described
(focusing on the value stream and eliminating
waste) have been successfully
applied to manufacture and repair
processes in the automotive and aerospace
industries for some time. Innovative organizations
are now applying Lean principles
to their design and product development
challenges. The emphasis in this domain is
on eliminating waste, particularly in make-vs-buy
decisions [2].

Application  of  Lean 

Our  initiative  has  the  goal  of  transforming 
the  traditional  AISF  to  a  Lean  AISF,  by 
moving from obsolete hardware/software
to modern systems that are based on
COTS equipment, open industry standards,
and open source software where
appropriate.  In  particular,  we  aim  to  lower 
the  cost  to  support  the  AISF  by  applying 
Lean  principles  to  product  development 
to  eliminate  waste  whenever  possible.  The 
expected  benefits  of  this  transformation 
are  reduced  hardware  maintenance  costs 
for  AISF  hardware,  easier  migration  of 
new  technology  into  existing  AISFs,  and 
reduced  development  costs  for  new  AISFs 
to  support  weapon  systems  currently  in 
development. 

Meeting  the  stringent  real-time  con¬ 
straints  of  simulating  a  fighter  requires 
significant  computing  horsepower.  The 
first, second, and third generations of the
AISF, like all system integration facilities
built during the 1980s and 1990s, were
based on expensive minicomputer hardware
running proprietary operating systems
and software development toolsets.
In  addition,  a  large  investment  in  custom- 
designed  hardware  and  software  was  nec¬ 
essary  to  meet  the  system’s  requirements, 
using  the  then-available  technology.  In 
implementing  the  fourth  generation  AISF, 
our  aim  is  to  eliminate  waste,  especially  in 
the  make-vs-buy  decisions  that  so  strong¬ 
ly  drive  life-cycle  costs. 

Example 1: Simulation
Computers 

For  years,  we  have  used  minicomputers 
from  a  major  simulation  vendor  to  host 
our  real-time  simulation  software.  These 
machines  have  been  true  workhorses  for 
us,  but  with  the  passage  of  time  there 
were  several  reasons  to  move  to  newer 
technology.  First,  since  these  machines  are 
based  on  the  vendor’s  proprietary  hard¬ 
ware,  we  have  supported  them  via  vendor 
maintenance  contracts.  This  approach 
gave  us  superb  support  but  was  a  strain  on 
the  budget.  Second,  as  technology  has 
advanced,  our  options  for  upgrading  these 
computers  grew  limited.  For  example,  the 
largest  hard  drives  that  they  could  accom¬ 
modate  are  two  gigabytes:  This  was  great 
when  the  computers  were  new  in  1991  but 
rather  constrained  some  15  years  later. 

We  conducted  a  trade  study  to  evaluate 
three  alternatives.  First,  we  could  migrate 
from  our  existing  simulation  computers 
along  the  vendor’s  upgrade  path  to  their 
next  generation  product.  A  significant  fea¬ 
ture  of  this  alternative  is  the  move  to  the 
open  source  Red  Hat  Linux  operating  sys¬ 
tem  with  the  RedHawk  real-time  kernel. 
Second,  we  could  build  our  own  simula¬ 
tion  hosts  using  COTS  hardware  running 
Linux.  Third,  we  could  expand  the  search 
space  to  the  proprietary  simulation  prod¬ 
ucts  of  other  commercial  vendors. 

Alternatives  one  and  two  both  used 
standard  Intel-based  servers  running  a 
version  of  Linux.  The  tremendous  growth 
of  the  Internet  had  driven  massive  indus¬ 
try  investment  in  servers,  lowering  the  unit 
price  of  raw  processing  power.  Alternative 
one also included the vendor’s proprietary
hardware  and  software  to  provide  a  sys¬ 
tem  optimized  for  real-time  processing, 
albeit  at  a  significantly  higher  price. 

At  first  it  was  tempting  to  believe  we 
could  assemble  a  solution  in-house  by 
using  off-the-shelf  hardware  (which  we 
would  buy  strictly  on  price)  and  installing 
Linux.  However,  to  re-host  our  legacy 
software  to  such  a  platform  would  require 
specific  real-time  capabilities  -  capabilities 
we  would  have  to  develop  from  scratch. 
As  we  began  to  tally  up  the  engineering 
development  costs,  it  became  clear  that 
cheap  hardware  could  be  too  expensive. 

Our trade study evaluated the alternatives
on the basis of the following: 1) real-time
capability, 2) supportability (over a
nominal 10-year design life), 3) purchase
costs, and 4) transition costs (including
costs to re-engineer existing simulation
software). Alternative one was the clear
winner.  The  vendor’s  solution  provided  us 
an  upgrade  path  where  we  could  port  our 
large  legacy  code  base  with  minimal  effort 
relative  to  other  approaches.  Although  we 
could  have  bought  equivalent  hardware 
for  half  the  price  from  other  sources,  the 
ability  to  quickly  port  our  large  legacy 
code  base  was  a  value  proposition  that 
surpassed  the  other  alternatives. 
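
A trade study of this kind is often organized as a weighted scoring matrix. The sketch below shows the mechanics only. The weights, the 1-to-5 scores, and the alternative names are hypothetical values chosen for illustration; they are not the figures used in the study described above.

```python
# Hypothetical weighted scoring matrix for a trade study.
# All weights and scores below are illustrative assumptions.

CRITERIA_WEIGHTS = {
    "real_time_capability": 0.35,
    "supportability": 0.25,
    "purchase_cost": 0.15,
    "transition_cost": 0.25,
}

# Scores run from 1 (poor) to 5 (excellent). Higher is always better,
# so an inexpensive alternative earns a HIGH cost score.
ALTERNATIVES = {
    "vendor_upgrade_path": {
        "real_time_capability": 5, "supportability": 4,
        "purchase_cost": 2, "transition_cost": 5,
    },
    "in_house_cots_linux": {
        "real_time_capability": 2, "supportability": 3,
        "purchase_cost": 5, "transition_cost": 1,
    },
    "other_vendor_product": {
        "real_time_capability": 4, "supportability": 3,
        "purchase_cost": 2, "transition_cost": 2,
    },
}


def weighted_score(scores, weights):
    """Sum of weight * score over every criterion."""
    return sum(weights[name] * scores[name] for name in weights)


def rank_alternatives(alternatives, weights):
    """Alternative names ordered from best to worst weighted score."""
    return sorted(
        alternatives,
        key=lambda name: weighted_score(alternatives[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank_alternatives(ALTERNATIVES, CRITERIA_WEIGHTS):
        score = weighted_score(ALTERNATIVES[name], CRITERIA_WEIGHTS)
        print(f"{name}: {score:.2f}")
```

With these assumed numbers, a cheap purchase price cannot offset a poor transition-cost score, which mirrors the qualitative conclusion of the study.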

Lesson  Learned 

Hardware  may  be  cheap,  but  software 
engineers  are  expensive.  When  dealing 
with  legacy  systems,  we  found  that  the 
most  cost-effective  approach  is  generally 
the  one  that  minimizes  the  software  re¬ 
host  effort. 

Example  2:  Bus  Interface 
Cards 

The  H009  multiplex  bus  was  an  early  fore¬ 
runner  of  the  Military  Standard  (MIL- 
STD)-1553B  data  bus  that  has  become 
standard  in  military  and  even  commercial 
aircraft.  Since  H009  was  never  as  widely 
adopted  as  1553B,  there  have  always  been 
relatively  few  suppliers  of  this  hardware. 

From the early days of the AISF, we
made significant investments in designing,
building, and maintaining custom H009
interface cards. Our most
recent implementation was designed in the
early 1990s and had become unsupportable
due to both obsolescence and personnel
turnover. We had entered the H009
business  simply  because  at  the  time  we  felt 
there  were  no  viable  commercial  alterna¬ 
tives.  In  recent  years,  the  engineering 
expertise  to  support  this  very  specialized 
design  across  a  small  installed  base  (about 
12  units,  total)  had  eroded  significantly. 


We  had  a  strong  desire  to  stop  supporting 
in-house  custom  solutions,  and  by  2005 
several  vendors  were  offering  H009  prod¬ 
ucts. 

As  we  investigated  them,  it  quickly 
became  clear  that  none  would  operate  in 
our  system  without  a  significant  rewrite  of 
our  existing  software.  In  our  third-genera¬ 
tion  hardware  design,  the  software  engi¬ 
neers  had  requested  a  number  of  features 
they thought would be needed. Over the
last dozen years we had learned that some
of those features were seldom, if ever,
used - a form of waste. In effect, our
board  had  been  designed  with  some  capa¬ 
bilities  that  were  unnecessary  and  with 
others  that  were  perhaps  better  done  in 
software.  In  order  to  use  the  available 
COTS  hardware,  we  would  have  to 
migrate  some  of  the  functionality  of  our 
custom  hardware  into  an  enhanced  ver¬ 
sion  of  our  software. 



We  had  to  trade  off  the  costs  of 
implementing  a  new  custom  hardware 
design  and  then  supporting  it  for  a  num¬ 
ber  of  years  versus  the  one-time  cost  to 
modify  the  legacy  interface  software  to 
accommodate  the  feature  set  offered  by 
off-the-shelf  solutions.  Another  factor  we 
considered  in  our  analysis  was  available 
support  for  the  COTS  equipment. 
Fortunately,  the  F-15  is  gradually  migrat¬ 
ing  away  from  the  H009  bus  to  the  much 
better  supported  MIL-STD-1553B  bus. 
By  provisioning  the  proper  number  of 
spare  cards,  we  expect  to  support  the 
H009  bus  for  as  long  as  it  remains  in  use 
on  the  aircraft. 

Lesson  Learned 

A one-time investment of engineering
dollars can be cost effective if it allows the
use of COTS equipment and eliminates
the engineering effort required to design
and support an in-house solution over a
period of years.
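
The trade-off between a one-time software investment and a recurring support burden comes down to simple break-even arithmetic. The following is a minimal sketch. The function names and every dollar figure (in thousands) are hypothetical and are not taken from the AISF analysis.

```python
def cumulative_cost(one_time, annual, years):
    """Total spent after a number of years: up-front cost plus recurring cost."""
    return one_time + annual * years


def break_even_year(buy_one_time, buy_annual, make_one_time, make_annual,
                    horizon=30):
    """First whole year in which buying costs no more than making.

    Returns None if buying never catches up within the horizon.
    """
    for year in range(horizon + 1):
        buy = cumulative_cost(buy_one_time, buy_annual, year)
        make = cumulative_cost(make_one_time, make_annual, year)
        if buy <= make:
            return year
    return None


if __name__ == "__main__":
    # Assumed figures, in thousands of dollars:
    #   buy  = one-time software rewrite plus COTS cards, small yearly upkeep
    #   make = new custom board design, larger yearly engineering support
    print(break_even_year(buy_one_time=300, buy_annual=10,
                          make_one_time=200, make_annual=60))
```

Under these assumptions the COTS route costs more up front but the totals meet at year two, and every year of service after that favors buying.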

Example  3 

In  the  first  generation  AISF,  circa  early 
1980s,  we  used  real  aircraft  control  panels 
in  our  cockpit  mock-ups.  Although  these 
gave  the  user  a  realistic  experience  in  the 
lab,  there  were  several  drawbacks  with 
them.  Aircraft  hardware  is  expensive  to 
obtain,  difficult  to  maintain,  and  has  to  be 
interfaced  to  the  simulation  computers 
using  custom  hardware.  In  our  second- 
generation  designs  (late  1980s),  we  began 
experimenting  with  touch-screen  equip¬ 
ped  PCs  as  replacements  for  aircraft  con¬ 
trol  panels.  This  approach  eliminated  air¬ 
craft  hardware  while  still  allowing  us  suffi¬ 
cient  realism  for  the  purposes  of  OFP 
development.  However,  implementing 
that  approach  required  developing  the 
software  to  display  buttons,  switches,  etc., 
and  to  respond  to  the  user  as  he  or  she 
activated  these  simulated  controls.  At  the 
time,  this  meant  a  significant  investment 
in  custom  software  development. 

Fast  forward  to  2006.  Our  original 
touch  screen  PC  hardware  had  been 
replaced  several  times,  but  the  software 
had  been  modified  only  slightly  over  the 
years  and  was  in  definite  need  of  a  major 
overhaul. But now we had options. The
market for PC graphics software had
grown greatly, and several vendors
offered promising products - the promise
being that re-implementing our existing
applications  would  be  as  easy  as  drawing 
the  panels  using  the  vendor’s  graphical 
editors.  The  old-timers  among  the  techni¬ 
cal  staff  were  skeptical  that  it  could  be 
that  easy,  while  the  younger  engineers 
were  eager  to  try  out  new  toys. 

We  evaluated  various  products  and 
then  made  the  investment.  By  using  the 
vendor’s  tool,  a  trained  engineer  could 
prototype  a  control  panel  in  a  fraction  of 
the  time  it  would  have  taken  with  hand¬ 
crafted  code. 

However,  what  the  tool  saved  us  in 
creating  panels  it  took  back  in  time  to  inte¬ 
grate  them  when  new  hardware  arrived. 
One  significant  problem  involved  a  Linux 
driver  that  assumed  a  specific  hardware 
configuration  different  from  what  we  had 
purchased.  In  the  end,  a  senior  engineer 
rewrote  the  driver  so  that  all  the  pieces 
would  play  together. 

Lesson  Learned 

The  young  engineers  were  right  that  the 
COTS  tools  would  simplify  the  process  of 
generating control panels. But the greybeards
were right too. There are always
integration issues, and it is precisely at the
point where one vendor’s product meets
another’s that the hard work usually takes
place.

Conclusion 

A  Lean  approach  to  AISF  development 
and  support  would  eliminate  waste  when¬ 
ever  possible.  COTS  products  can  be 
incorporated  to  great  advantage  if  the 
engineering  staff  carefully  weighs  all  alter¬ 
natives  when  considering  make-versus- 
buy  decisions.  In  those  cases  where  a 
COTS  product  is  appropriate,  it  can  elim¬ 
inate  the  waste  of  supporting  a  custom 
solution  using  expensive  in-house  engi¬ 
neering  talent.  As  always,  it  is  important  to 
clearly define the trade-offs and ramifications
of using a COTS product.
munities, maturing, and transitioning
practical solutions into DoD organizations
and the wider community.

Ten  Commandments  of 
COTS 
Interest in COTS products requires
examination both in terms of its causes
and effects, and in terms of its benefits
and liabilities. The Defense Acquisition
University  offers  some  observations  and 
voices  some  specific  concerns  and  criti¬ 
cisms.  They  stress  that  their  observations 
are  essentially  cautionary,  not  condem¬ 
natory:  Huge  growth  in  software  costs 
will  continue,  not  abate,  and  appropriate 
use  of  commercially  available  products  is 
acquire needed capabilities in a cost-
effective  manner.  Where  use  of  an  exist¬ 
ing  component  is  both  possible  and  fea¬ 
sible,  it  is  no  longer  acceptable  for  the 
government  to  specify,  build,  and  main¬ 
tain  a  comparable  product. 
equipment. The subscription is free
online,  and  archives  can  be  accessed  via 
the  Web  site  without  signing  up  just  by 
clicking  on  the  archives  button  on  the 
menu  bar. 
GL Studio Brings Realism to
Aircraft  Cockpit  Simulator  Displays 


Kim  Stults 

580th  Software  Maintenance  Squadron 

Testing operational flight programs (OFPs) for aircraft requires that the user be able to enter data and generate output via a
graphical user interface (GUI). Desiring to enhance the realism for the test team and upgrade the outdated software, we began
the search for a new tool. A new commercial off-the-shelf (COTS) product was selected and a success story unfolded. This
article presents that story.


In 1995, after a decade of development,
the  Special  Operation  Forces  (SOF) 
Extendable  Integration  Support  Environ¬ 
ment  (EISE)  was  established  at  Robins 
Air  Force  Base  in  Warner  Robins, 
Georgia.  The  purpose  of  the  SOF  EISE 
was  to  provide  hardware  and  software 
support  for  seven  selected  avionics  sys¬ 
tems  (see  Figure  1).  The  support  environ¬ 
ment  permits  the  modification  and  test  of 
the  OFPs  running  on  the  aircraft. 

A key component of the SOF EISE is
the crew interface (CIF) simulation. It is
designed to allow real-time, simultaneous
operations with the line replaceable unit
and environment simulation portions of
SOF EISE. The computer display touchscreens
provide a simulation of the actual
aircraft cockpits. The CIF stands as the
key  control  element  between  the  system 
user  and  both  the  real  and  simulated  air¬ 
craft  avionics.  It  provides  a  means  by 
which  the  functional  capability  of  simulat¬ 
ed  aircraft  avionics,  real  aircraft  avionics, 
and  their  associated  OFPs  can  be  evaluat¬ 
ed  against  the  required  capability  for  these 
system  components. 

Problem 

After almost 20 years, the architecture of
the CIF segment was becoming unsupportable.
A unique CIF simulation executed
on a single-board computer (SBC) for
each of the seven systems. The SBC code
communicated with two Silicon Graphics
computers over a network using a remote
procedure call (RPC) protocol. Three
processes ran on one of the Silicon Graphics
machines (client, server, and GUI), and two
processes ran on the second (client and
GUI). These processes communicated
using COTS software library calls. The
hardware was becoming unsupportable
and the software solutions were outdated.
The segment was incredibly complex, which
resulted in reliability problems as well as
high maintenance costs.

Figure 1: SOF EISE Supported Aircraft

SOF Aircraft                        Deployment Date
MH-53J PAVE LOW III (PL3)           1995
AC-130H Gunship (GS)                1997
MC-130H Combat Talon II (CT2)       1998
MC-130E Combat Talon I (CT1)        1999
MH-53M PAVE LOW IV (PL4)            1999
EC-130H Compass Call (CC)           2003
HH-60G PAVE HAWK (PH)               2006

Potential  Challenges 

Even though a COTS product had been in
place, high maintenance cost, poor customer
support, and an unreliable development
tool led us to investigate what
today’s technology could offer in
both hardware and software. Our requirements
fell into two basic categories:

1.  Functional  requirements.  These 
consisted  primarily  of  requirements 
elicitation  from  our  test  group  cus¬ 
tomer.  The  test  group  had  written 
extensive  procedures  over  the  years 
that  referred  to  specific  graphical 
objects  on  the  interface.  The  steps 
included  observe  actions  that  specified 
color,  position,  button  presses,  and 
other  very  specific  details  of  the  lega¬ 
cy  graphics.  This  was  a  justifiable  con¬ 
straint  given  the  level  of  effort  that 
would  be  required  to  change  thou¬ 
sands  of  pages  of  test  procedures. 

2.  Non-functional requirements. These
were to retarget the CIF GUI, which
resided on a Silicon Graphics machine,
to a Linux-based system and to achieve
better performance and higher reliability.

COTS  Considerations 

After  it  was  decided  that  the  functional 
capabilities  of  the  legacy  system  would 
determine  the  requirements  for  the 
upgrade,  a  development  team  met  to  dis¬ 
cuss  the  vision  for  the  upgrade. 
Discussion  focused  on  previous  problems 
with  vendors,  ease  of  debugging,  and  the 
future of the system. From this discussion,
it was determined that we were looking
for the following [1]:

•  A fully interactive 2-D or 3-D open
graphic library-based development
tool. We needed a tool that would
allow  us  to  create  custom  widgets  that 
looked  exactly  like  the  legacy  widgets. 
Only  tools  with  the  ability  to  create  2- 
D  objects  could  do  that.  We  also  want¬ 
ed  the  capability  to  improve  the 
appearance  of  those  objects  that  did 
not  impact  test  procedures.  In  the 
future,  we  will  need  to  support  new 
platforms. For those new platforms,
we will not be restricted to legacy
appearances. For these, we would like
to  create  more  photo-realistic  panels. 
There  is  a  growing  desire  for  out-the- 
window  views  by  the  users.  Should 
that  ever  become  a  requirement,  we 
want  to  be  able  to  meet  it  without  hav¬ 
ing  to  change  tools.  A  3-D  tool  is 
required  for  that. 

•  Non-proprietary,  human-readable, 
object-oriented  code.  Historically, 
there  have  been  issues  with  the  legacy 
tool  that  required  us  to  examine  the 
interim  files  that  were  generated. 
Because  the  interim  files  were  written 
in  a  proprietary  format,  we  were 
unable  to  analyze  certain  conditions 
while  debugging.  This  significantly 
hampered  our  ability  to  quickly  correct 
some  defects,  and  we  were  not  anxious 
to  subject  ourselves  to  that  limitation 
again. 

•  Lower  development  cost.  The  legacy 
tool  was  old  and  its  customer  base  was 
decreasing.  Licensing  fees  were 
increasing.  Expertise  with  the  tool  was 
limited.  All  these  factors  forced  the 
cost  of  the  development  tool  to  rise. 
Newer,  cheaper,  and  more  capable 
tools  existed.  We  would  take  a  cursory 
look  at  a  few  of  them  and  determine 
which  of  those  deserve  further  evalua¬ 
tion. 

•  Efficient object-oriented designs
and code generation. An object-oriented
approach to programming is
accepted industry-wide. While it is not
a function of the tool to provide the
approach, some tools make it easier
than others. We wanted a tool that
would support, even enforce, object-oriented
code.

•  A  compact  runtime  library.  Using 
our  legacy  tool,  the  libraries  had 
become  nearly  unmanageable  due  to 
increasing  size.  While  the  increases 
were  mandated  by  increased  capability, 
they were still consuming disk space.

•  Flexible  licensing  options.  As  the 
number  of  supported  platforms 
grows,  we  needed  to  be  able  to  adjust 
our  licensing  agreements.  We  needed 
to  be  able  to  establish  a  fixed  number 
of  development  licenses  and  a  differ¬ 
ent  number  of  runtime  licenses.  We 
needed the option of a site license in
the event that the number of
platforms we were required to support
grew to the point that a site license
was more economical than individual
runtime licenses.

•  Proven  COTS  product  with  some 
demonstrated  level  of  maturity. 
There  are  a  plethora  of  tools  on  the 
market  that  meet  our  needs.  Many  of 
them  are  excellent,  some  are  only 
good. We had neither the time nor the
engineering resources to evaluate all,
or even most, of them in depth. We
elected  to  rely  on  the  tool’s  customer 
base  to  do  that  for  us.  Mature  tools 
have  a  large  customer  base.  We  wanted 
an  alternative  pool  for  advice,  cus¬ 
tomers  who  were  already  using  the 
tool. 

•  COTS  vendor  with  good  technical 
support.  Every  new  tool  comes  with 
the  promise  of  technical  support.  We 
wanted  to  be  sure  that  we  were  getting 
more  than  promises.  The  vendor  we 
eventually  selected  provided  excellent 
technical  support  during  the  evalua¬ 
tion  period.  We  described  several 
unique  problems  and  they  promptly 
provided  explanations  or  solutions. 

Certainly there are a lot of graphics
products on the market. We looked at several
and had vendors come in to demonstrate
their products. We obtained references
from those companies and contacted
their customers. One can usually learn
more from a vendor’s customer than from a
vendor representative. For example, you
can find out what to expect in the company’s
training courses, the quality of the
technical support, the tool’s ease of use,
etc. After making an initial company selection
for prototyping, we arranged for a
week-long training session at a facility near
our site. The instructor was technically
competent and a veteran in front of the
classroom. He provided satisfactory
answers and examples for all of our questions.
Most of the questions were drawn
from what we thought would be difficulties
with using his company’s tool. In
every case, we were more than satisfied
with his response.

Figure 2: Selected Architecture for Configuration

Implemented configuration: Linux PC (Silicon Graphics replacement)
Hardware: Pentium 4 processor, 3.6 GHz, Intel EM64T; nVidia Quadro FX 3400 dual Video Graphics Array or Digital Visual Interface graphics card
Software: Red Hat Enterprise version 4.0; DiSTI’s GL Studio


Obstacles

Change is not always welcome. There
were concerns in moving from the familiar
to the unfamiliar. We had previously
faced difficulties with customer support,
so a good working relationship with the
vendor was crucial. The test team was
concerned that the look and feel
of the test station not be changed, for fear
of impact to the current test procedures.

The other obstacles we encountered
were typically hardware-related or self-imposed.
The original Linux personal
computer we were given for development
had an operating system (OS) and graphics
card that were incompatible with the tool we
were evaluating. Our first step was to
acquire a compatible OS. At this point
though, the hardware acquisition still
lagged behind the software we were developing.
The integrated product would be
driving two monitors. We did our early
development on systems that only had
one. We were concerned that mouse clicks
on the second monitor would not return
the screen coordinates we were expecting.
We tasked one of our developers to examine
the possibility of intercepting the universal
serial bus event stream and modifying
it. This was a risk mitigation step. Our
next step was to choose a multi-headed
graphics card to work with our selected
OS. Finally, we had to find a touch screen
that had existing drivers for our chosen
Linux OS. Once all of our hardware was in
place, we found that the desired (x, y)
coordinates were being returned from mouse
events and the dual screen implementation
was a non-issue. The vendor had assured
us that this would not be a problem, but
we had no way to prove it. Therefore, a lot
of time had been spent on risk mitigation
that was completely unnecessary.
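
The dual-monitor concern described in this section can be made concrete with a small sketch. It assumes two monitors placed side by side on one logical desktop and shows how a desktop-wide click position maps to a monitor and a local coordinate. The `Monitor` type, the `locate` function, and the 1280 x 1024 resolution are hypothetical; the code does not represent GL Studio or any windowing-system API.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Monitor(NamedTuple):
    """One screen's rectangle within the combined desktop (assumed layout)."""
    name: str
    x_offset: int
    y_offset: int
    width: int
    height: int


def locate(x: int, y: int,
           monitors: Sequence[Monitor]) -> Optional[Tuple[str, int, int]]:
    """Map a desktop-wide click to (monitor name, local x, local y).

    Returns None when the point lies outside every monitor.
    """
    for m in monitors:
        inside_x = m.x_offset <= x < m.x_offset + m.width
        inside_y = m.y_offset <= y < m.y_offset + m.height
        if inside_x and inside_y:
            return m.name, x - m.x_offset, y - m.y_offset
    return None


# Hypothetical side-by-side layout for a two-headed graphics card.
DUAL_HEAD = [
    Monitor("left", 0, 0, 1280, 1024),
    Monitor("right", 1280, 0, 1280, 1024),
]

if __name__ == "__main__":
    print(locate(1300, 10, DUAL_HEAD))
```

A check like this, run against real touch events once the hardware arrives, is one inexpensive way to confirm whether coordinate remapping is needed at all.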


The  Solution 

The  ultimate  architecture  for  our  solution 
was  a  combination  of  procedural  pro¬ 
gramming  in  the  system  simulation  and 
object-oriented  programming  in  the  GUI. 
Figure  2  depicts  the  selected  architecture 
for  our  solution.  DiSTI  GL  Studio  was  a 
perfect  fit  for  our  COTS  product  require¬ 
ments.  GL  Studio  is  a  premier  Human 
Machine  Interface  (HMI)  development 
toolkit  that  allows  for  the  creation  of  end- 
to-end  safety  critical  displays  from  proto¬ 
type  to  delivery.  GL  Studio  has  no  propri¬ 
etary  formats  and  flexible  licensing 
options,  and  it  produces  reliable,  safe,  effi¬ 
cient,  reusable  applications  in  a  rapid  and 
easy fashion. The GL Studio development
package was a fraction of the cost of our
previous COTS product. The licensing
fees alone will save approximately
$120,000 over the next five years. The PCs
will save an additional $42,000 over the
same  time  period. 

Keys  to  Successful  Integration 

Our  COTS  integration  was  completed 
ahead  of  schedule  and  within  budget.  The 
integration  effort  was  a  success  story  for 
many  reasons. 

Partnering  With  the  Vendor 

We  received  and  continue  to  receive  excel¬ 
lent  technical  support  from  DiSTI.  Some 
vendors  have  a  take-it-or-leave-it 
approach.  We  were  a  team  and  they  want¬ 
ed  us  to  succeed.  We  were  not  the  only 
ones working weekends when we had a
problem to solve. In particular, they worked
one  weekend  to  develop  a  drop-in  keypad 
class  for  us.  In  the  beginning,  we  had 
issues  with  the  particular  OS  we  had  cho¬ 
sen.  It  was  not  supported  by  GL  Studio, 
but  they  helped  us  get  it  working  anyway. 
They  also  had  many  customers  running 
multi-headed  touch  screen  applications  in 
Windows,  but  we  were  the  first  to  attempt 
it with Linux. Thanks to DiSTI’s support,
we  were  able  to  implement  our  desired 
design. 

Networking With Similar Users

DiSTI put us in touch with the C-130
Self-Contained Navigation System
(SCNS) project. ARINC had previously designed
photo-realistic  pilot  panels  for  the  SCNS 
project.  We  were  able  to  obtain  the 
reusable  software  objects  for  several  pilot 
instruments  and  implement  them  into  our 
architecture  seamlessly.  This  saved  a 
tremendous  amount  of  time  and  money. 
For  our  initial  delivery,  we  did  not  have 
time  in  the  schedule  to  photograph  our 
own  instruments  and  code  the  displays. 
Having  a  product  that  was  already  being 
used  with  displays  that  we  needed  was  a 
life  saver. 

Selecting the Hardware

Implementing the design of one multi-head
PC with two monitors on each of
the five platforms reduced the 10
required host computers to five. This
single-PC architecture, along with a
shared memory segment that replaced the
outdated communications library, greatly
simplified our code, eliminating approximately
25,000 lines of code.
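
To illustrate why a shared memory segment can replace a networked communications library once both sides run on one PC, here is a minimal sketch using Python's standard `multiprocessing.shared_memory` module (Python 3.8 or later). The record layout (heading, altitude, button bitmask) and all function names are assumptions made for this example; the article does not describe the actual EISE data layout, and the sketch omits the locking a real simulation would need.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical record: heading (deg), altitude (ft), button bitmask.
RECORD = struct.Struct("<ddI")


def create_segment():
    """Create a new shared segment large enough for one record."""
    return shared_memory.SharedMemory(create=True, size=RECORD.size)


def write_state(shm, heading, altitude, buttons):
    """Simulation side: pack the current state into the segment."""
    RECORD.pack_into(shm.buf, 0, heading, altitude, buttons)


def read_state(shm):
    """GUI side: unpack the latest state from the segment."""
    return RECORD.unpack_from(shm.buf, 0)


def demo():
    """Write from one handle and read through a second handle by name."""
    sim_side = create_segment()
    try:
        write_state(sim_side, 270.0, 12000.0, 0b0101)
        # A separate process would attach the same way, using the name.
        gui_side = shared_memory.SharedMemory(name=sim_side.name)
        try:
            return read_state(gui_side)
        finally:
            gui_side.close()
    finally:
        sim_side.close()
        sim_side.unlink()  # release the segment once both sides are done


if __name__ == "__main__":
    print(demo())
```

Both sides read and write the same bytes directly, so there is no marshalling, connection handling, or client and server process to maintain, which is consistent with the code reduction reported above.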

Assessment  of  Results 

The overall results for our upgrade have
been outstanding. At this time, all of the
platforms have been successfully ported
to the new architecture and have been in
use for approximately six months. The
integration has been a success not only for
the development team but for the end
users as well.

Figure 3a and 3b: CT1 Pilot Display

Customer  Satisfaction 

Even  though  our  customer,  OFP  develop¬ 
ers,  and  testers  were  leery  of  a  change  in 
the  beginning,  they  have  been  very  pleased 
with the improvements in our architecture,
the most notable improvement being
the  photo-realistic  displays  for  the  pilot 
instruments  (Figure  3,  a  and  b).  This,  of 
course,  enhances  the  realism  of  flight  sim¬ 
ulation  and  test. 


Sharing Across Multiple Platforms

A  huge  success  of  the  EISE  in  general  is 
the  ability  to  share  software  across  multi¬ 
ple  platforms.  The  object-oriented  design 
approach  saved  us  a  substantial  amount  of 
time  by  providing  the  capability  to  reuse 
components instead of redeveloping individual
instruments or instrument parts.
The  new  COTS  software  also  supports 
multiple  OSs,  leaving  the  future  possibility 
for  further  upgrades  and  configuration 
changes  completely  open  and  easy  to 
maintain. 
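
The reuse benefit of an object-oriented design can be sketched in a few lines. The example below defines a common instrument base class and two instruments that several platform panels could share. Every class and function name is hypothetical and is written in Python for brevity; it is not the GL Studio object model or the EISE code.

```python
class Instrument:
    """Behavior shared by every simulated instrument."""

    def __init__(self, label):
        self.label = label
        self.value = 0.0

    def limit(self, value):
        """Subclasses override this to keep values in a valid range."""
        return value

    def update(self, value):
        self.value = self.limit(float(value))

    def render(self):
        return f"{self.label}: {self.value:.1f}"


class Altimeter(Instrument):
    def limit(self, value):
        return max(0.0, value)  # altitude never reads below zero


class HeadingIndicator(Instrument):
    def limit(self, value):
        return value % 360.0  # wrap heading into [0, 360)


def build_panel(platform, instrument_types):
    """Assemble a platform's panel from already-written instrument classes."""
    return [cls(f"{platform} {cls.__name__}") for cls in instrument_types]


if __name__ == "__main__":
    # Two platforms reuse the same classes without any new instrument code.
    for platform in ("CT1", "CT2"):
        for instrument in build_panel(platform, [Altimeter, HeadingIndicator]):
            instrument.update(370.0)
            print(instrument.render())
```

Adding a platform then means composing existing classes, and a fix to one instrument reaches every panel that uses it.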

Conclusion 

Use  of  COTS  products  is  a  viable  way  to 
upgrade  existing  systems.  As  part  of  our 
development plan, the port was completed
in incremental steps. In order to minimize
risk, our first step was to port only
the code that ran on the Silicon Graphics
machine and to maintain the original
interfaces between the SBC code and the
Silicon Graphics machine. Once the new graphics
were  in  place,  the  second  phase  was  to 
restructure  the  SBC  code  and  make  min¬ 
imal  interface  changes.  This  approach 
worked  very  well  for  our  initial  delivery. 

Now  that  we  have  transitioned  to  the 
new  architecture,  we  have  many  options 
available  to  pursue.  The  most  important 
option is the addition of more photo-realistic
displays as time and schedule
permit. The Linux solution was a good
fit  with  the  planned  migration  to  a  real¬ 
time  Linux  architecture  in  the  lab.  In  gen¬ 
eral,  the  COTS  HMI  development  tool 
was  able  to  help  us  save  a  substantial 
amount  of  time  and  budget,  allowing  us 
to  complete  our  objective  well  within  our 
required  deadline.  The  new  architecture 
of  the  instrumentation  lends  itself  for 
use  in  many  other  applications  for  the 
future  as  well,  further  saving  time  and 
budget  for  the  SOF  EISE  lab  as  well  as 
any  other  entity  that  may  receive  this 
source as government-furnished information
for other similar programs.
As mission-critical, real-time software systems grow in size and complexity, there is increasing pressure to incorporate commercial
off-the-shelf (COTS) components as part of the strategy for reducing the total costs of developing and maintaining
these systems. In the traditional information technology (IT) space, the object-oriented features of the Java language have
proven their value in enabling significant increases in software reuse, and Java has now overtaken C and C++ as the language
of choice for most enterprise IT projects. This article discusses the use of Java in mission-critical, real-time systems and
emphasizes approaches that address common requirements for portable, efficient, responsive, and predictable real-time systems.
These approaches make it possible to develop both soft and hard real-time components, which become COTS real-time components
for integration within future real-time systems. Also supported is the ability to integrate non-real-time COTS Java
components within systems that have high reliability and real-time constraints.


Moore’s  law  [1],  proven  by  more  than 
four  decades  of  experience, 
describes  the  well-known  phenomenon 
that,  for  a  given  price  point,  commer¬ 
cially  available  computational  capacity 
doubles  every  18  months.  This  increas¬ 
ing  computational  capacity  is  used  by 
application  software  to  provide 
improved  functional  capabilities,  to 
respond  to  increasing  numbers  of  ser¬ 
vice  requests  with  shorter  response 
times,  and  to  support  more  efficient 
operations  (better  fuel  efficiency, 
reduced  system  failures  and  down  time, 
and  increased  productivity).  Studies  of 
embedded  system  trends  demonstrate 
that  the  size  of  software  in  embedded 
systems  also  grows  exponentially,  dou¬ 
bling  in  size  every  18-36  months, 
depending  on  the  industry  [2,  3]. 
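
The doubling periods quoted above translate into growth factors through a single formula: growth over a span of years equals 2 raised to (months elapsed / doubling period in months). The short sketch below works a few cases. The function name and the 15-year horizon are illustrative choices, not figures from the cited studies.

```python
def growth_factor(years, doubling_months):
    """Multiplicative growth after `years` when a quantity doubles
    every `doubling_months` months."""
    return 2.0 ** (years * 12.0 / doubling_months)


if __name__ == "__main__":
    # Capacity doubling every 18 months, over 3 and 15 years.
    print(growth_factor(3, 18))    # two doublings
    print(growth_factor(15, 18))   # ten doublings
    # Software size doubling every 36 months, over 15 years.
    print(growth_factor(15, 36))   # five doublings
```

Over the same 15 years, an 18-month doubling period gives a factor of 1,024 while a 36-month period gives a factor of 32, which shows how sensitive long-range projections are to the assumed doubling rate.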

Many  of  today’s  typical  embedded 
real-time  systems  are  comprised  of  hun¬ 
dreds  of  thousands,  if  not  millions  of 
lines  of  code.  Rather  than  develop  all  of 
the  code  required  for  each  new  product 
release  from  scratch,  the  majority  of 
today’s  software  growth  results  from  inte¬ 
gration  of  third-party  software  compo¬ 
nents  and  the  melding  of  independently 
developed  software  systems  into  larger 
integrated  systems  offering  the  combined 
capabilities  of  each  individual  part. 

This  article  discusses  the  use  of  the 
real-time  Java  language  as  an  enabling 
technology  to  greatly  reduce  the  efforts 
associated  with  developing  and  maintain¬ 
ing  reusable  real-time  software  compo¬ 
nents.  The  discussion  also  addresses  the 
need  to  integrate  within  real-time  systems 
certain  components  that  were  not  origi¬ 
nally  developed  with  the  rigor  or  discipline 
typical  of  real-time  software.  These  non- 
real-time  software  components  can  be 
successfully  deployed  within  real-time  sys¬ 
tems  as  long  as  systems  are  carefully  con¬ 
structed  to  assure  that  less  disciplined 


components  do  not  steal  resources  from 
the  allotments  for  real-time  components 
or  compromise  the  timely  execution  of 
them. 

What  Is  Real-Time  Software? 

The  correctness  of  a  real-time  system 
depends  not  only  on  delivering  correct 
computational  results,  but  also  on  deliv¬ 
ering  these  results  at  the  correct  time.  If 
results  are  delivered  too  early  or  too  late, 
the real-time system is operating incorrectly.
It is the software developer’s
responsibility to assure that the system
operates  correctly.  Real-time  software 
can  be  divided  into  two  broad  cate¬ 
gories:  hard  real-time  and  soft  real-time. 

Hard  real-time  systems  are  developed 
according  to  the  most  stringent  and  con¬ 
servative  practices  [4].  The  resource 
needs  of  each  hard  real-time  component 
are  determined  through  careful  theoreti¬ 
cal  analysis  of  the  worst-case  central 
processing  unit  (CPU)  time  and  memory 
consumption  along  every  worst-case 
path  through  the  code.  On  modern  CPU
architectures,  this  results  in  very  conser¬ 
vative  use  of  computing  resources. 

Soft  real-time  systems  comply  with  real-time
constraints  using  empirical  rather
than  analytical  techniques  [4].  Character¬ 
izing  each  component’s  resource  needs 


statistically,  developers  use  probability 
theory  to  assess  the  likelihood  that  a  sys¬ 
tem  integrating  multiple  independently 
analyzed  components  will  meet  all  of  its 
real-time  constraints.  Since  soft  real-time 
systems  are  not  proven  with  100  percent 
certainty  to  always  satisfy  all  real-time 
constraints,  soft  real-time  developers 
must  design  and  implement  contingency 
mechanisms  to  deal  with  the  occasional 
missed  deadline. 
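The statistical reasoning described above can be sketched in a few lines of Java. This is a minimal illustration, not a method taken from the cited guidelines: the class name, method names, and sample timings are hypothetical, and the combined estimate assumes that the components miss their deadlines independently of one another, which real systems sharing a CPU or memory bus may not satisfy.

```java
import java.util.Arrays;

/** Illustrative only: names and numbers are hypothetical, not from the article. */
final class SoftRealTimeEstimate {

    /** Fraction of measured execution times (in microseconds) that met the deadline. */
    static double hitRate(long[] samplesMicros, long deadlineMicros) {
        if (samplesMicros.length == 0) {
            throw new IllegalArgumentException("need at least one sample");
        }
        long hits = Arrays.stream(samplesMicros).filter(t -> t <= deadlineMicros).count();
        return (double) hits / samplesMicros.length;
    }

    /** Probability that every component meets its deadline, ASSUMING independence. */
    static double allMeet(double... perComponentHitRates) {
        double p = 1.0;
        for (double rate : perComponentHitRates) {
            p *= rate;
        }
        return p;
    }

    public static void main(String[] args) {
        long[] sensor = {180, 190, 210, 175, 260, 185, 195, 200, 188, 192}; // hypothetical
        long[] filter = {90, 95, 88, 102, 97, 91, 93, 89, 94, 96};          // hypothetical
        double pSensor = hitRate(sensor, 250); // 9 of 10 samples are within 250 us
        double pFilter = hitRate(filter, 100); // 9 of 10 samples are within 100 us
        System.out.printf("P(all deadlines met) ~ %.2f%n", allMeet(pSensor, pFilter));
    }
}
```

Because the combined probability is below 1, the integrated system still needs a contingency path for the occasional missed deadline.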

The  term  safety-critical  describes  software
systems  that  must  be  certified  to
the  satisfaction  of  government  regulato¬ 
ry  auditors  because  human  lives  may  be 
lost  if  the  software  malfunctions  [5]. 
Such  software  plays  critical  roles  in  com¬ 
mercial  avionics,  passenger  rail  systems, 
nuclear  power  plants,  and  certain  med¬ 
ical  equipment.  The  rigor  of  safety  cer¬ 
tification  calls  for  extremely  conserva¬ 
tive  development  practices.  As  described 
in  this  article,  safety-critical  Java  is  a 
proper  subset  of  hard  real-time  Java 
technologies. 

Using  the  Java  Programming 
Language  for  Real-Time 
Development 

The  appeal  of  Java  derives  from 
improved  developer  productivity, 
reduced  software  maintenance  costs, 
higher  software  reliability,  enhanced 
functionality,  and  improved  generality, 
all  of  which  lead  to  expanded  software 
longevity.  In  the  traditional  business 
information  processing  marketplace 
(financial  record  keeping,  customer  rela¬ 
tions  management,  inventory  controls, 
billing,  payroll,  etc.),  the  Java  program¬ 
ming  language  has  replaced  C++  as  the 
predominant  programming  language, 
largely  because  Java  programmers  are 
approximately  twice  as  productive  when 
developing  new  code  and  are  five  to  10
times  as  productive  during  maintenance
of  existing  code  [6-8].  Various  real-time
Java  technologies  extend  these  benefits 
into  embedded  real-time  systems. 

Much  of  the  content  presented  in 
this  article  derives  from  the  general  rec¬ 
ommendations  available  in  Guidelines  for 
Scalable  Java  Development  of  Real-Time
Systems  [9],  a  document  originally  devel¬ 
oped  to  guide  the  use  of  real-time  Java 
technologies  by  the  European  Space 
Agency.  The  document  details  three  dif¬ 
ferent  approaches  to  the  use  of  real-time 
Java,  each  tailored  to  the  needs  of  a  specific
audience:  soft  real-time,  hard  real-time,
and  safety-critical  real-time.  Safe
and  efficient  mechanisms  make  it  possible
to  build  complex  systems  composed
of  components  implemented  in  each  of
the  three  different  real-time  Java  profiles. 

An  important  objective  of  this  article 
is  to  help  engineers  understand  the 
trade-offs  in  selecting  between  alterna¬ 
tive  technologies.  Soft  real-time  Java 
technologies  offer  the  greatest  ease  of 
development  and  maintenance  and  pro¬ 
vide  access  to  the  largest  existing  avail¬ 
ability  of  ready-to-use  open-source  and 
COTS  Java  components.  The  more  con¬ 
strained  hard  real-time  and  safety-critical 
Java  technologies  are  more  difficult  for 
programmers  to  use,  but  they  offer 
improved  determinism  and  much  more 
efficient  deployment.  They  also  repre¬ 
sent  a  much  simpler  run-time  environ¬ 
ment,  facilitating  the  creation  of  safety- 
certification  artifacts. 

Developing  Reusable  Soft 
Real-Time  Components 

One  of  the  key  reasons  Java  developers
are  more  productive  than  C  and
C++  developers  is  automatic
garbage  collection.  According  to  a
study  performed  by  Xerox  Palo  Alto 
Research  Center  in  the  early  1980s  [10], 
automatic  garbage  collection  reduces 
programming  efforts  associated  with 
large,  complex  software  systems  by 
approximately  40  percent.  These  bene¬ 
fits  are  amplified  significantly  in  the  Java 
environment  because  automatic  garbage 
collection  is  the  foundation  on  which 
millions  of  lines  of  COTS  software, 
including  all  of  the  standard  Java 
libraries,  are  based.  Removing  garbage 
collection  from  the  Java  language  makes 
it  more  difficult  to  develop  new  software 
and  also  precludes  the  use  of  nearly  all 
existing  Java  library  code. 

However,  the  power  of  garbage  col¬ 
lection  comes  with  a  cost.  Traditional 
Java  implementations  occasionally  pause 


execution  of  all  Java  threads  to  scan 
memory  in  search  of  objects  no  longer 
in  use.  These  pauses  can  last  tens  of  sec¬ 
onds  with  large  memory  heaps.  Memory 
heaps  ranging  from  100  Mbytes  to  mul¬ 
tiple  gigabytes  are  currently  being  used 
in  mission-critical  systems.  The  30-second
garbage  collection  pause  times
experienced  with  traditional  Java  Virtual 
Machines  (VMs)  are  incompatible  with 
the  real-time  execution  requirements  of 
most  mission-critical  systems.  Special 
real-time  VMs  support  pre-emptible  and 
incremental  garbage  collection.  This 
approach  is  suitable  for  soft  real-time 
systems  with  timing  constraints  as  low  as 
a  few  hundred  microseconds. 

One  of  the  costs  of  automatic 
garbage  collection  is  the  overhead  of 
implementing  shared  protocols  between 


application  threads.  Application  threads 
continually  modify  the  way  objects  relate 
to  each  other  within  memory  while 
garbage  collection  threads  continually 
try  to  identify  objects  no  longer  reached 
from  any  threads  in  the  system.  This 
coordination  overhead  is  one  of  the 
main  reasons  that  compiled  Java  pro¬ 
grams  run  at  one  third  to  one  half  the 
speed  of  optimized  C  code. 

The  complexity  of  the  garbage  col¬ 
lection  process  and  of  any  software 
depending  on  garbage  collection  for  reli¬ 
able  execution  is  beyond  the  reach  of 
cost-effective  static  analysis  to  guarantee 
compliance  with  hard  real-time  con¬ 
straints.  Thus,  real-time  garbage  collec¬ 
tion  is  recommended  for  soft  real-time 
but  not  hard  real-time  systems. 

Portable  soft  real-time  components 
should  adhere  to  the  following  guide¬ 
lines: 

1.  Use  standard  edition  Java  Appli¬ 
cation  Programming  Interface  as  this 
provides  access  to  the  most  widely 
available  set  of  built-in  services. 

2.  Instrument  the  component  so  it  can 


determine  its  memory  and  CPU-time 
requirements  on  a  given  deployment 
target. 

3.  Deploy  the  software  on  a  real-time 
VM  that  supports  fixed-priority 
scheduling,  priority  inheritance,  pri¬ 
ority-ordered  queues,  and  real-time 
garbage  collection. 
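Guideline 2 can be approximated with the management interfaces of the standard edition platform. The sketch below is one possible form of such instrumentation, not the approach prescribed by the guidelines; ResourceProbe, Usage, and the sample workload are hypothetical names. It reads per-thread CPU time where the VM supports it and falls back to wall-clock time otherwise, so the figures are empirical estimates rather than worst-case bounds.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: measures one run of a workload on the current thread. */
final class ResourceProbe {

    /** CPU time in nanoseconds and heap growth in bytes for one run. */
    static final class Usage {
        final long cpuNanos;
        final long heapDeltaBytes;

        Usage(long cpuNanos, long heapDeltaBytes) {
            this.cpuNanos = cpuNanos;
            this.heapDeltaBytes = heapDeltaBytes;
        }
    }

    private static long cpuClock(ThreadMXBean threads) {
        // Fall back to wall-clock time where per-thread CPU time is unavailable.
        if (threads.isCurrentThreadCpuTimeSupported() && threads.isThreadCpuTimeEnabled()) {
            return threads.getCurrentThreadCpuTime();
        }
        return System.nanoTime();
    }

    private static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    static Usage measure(Runnable workload) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long heapBefore = usedHeap();
        long cpuBefore = cpuClock(threads);
        workload.run();
        long cpuAfter = cpuClock(threads);
        long heapAfter = usedHeap();
        // The heap delta may be negative if a collection ran during the workload.
        return new Usage(cpuAfter - cpuBefore, heapAfter - heapBefore);
    }

    public static void main(String[] args) {
        Usage u = measure(() -> {
            int[] scratch = new int[100_000]; // hypothetical workload
            for (int i = 0; i < scratch.length; i++) {
                scratch[i] = i;
            }
        });
        System.out.println("cpu ns: " + u.cpuNanos + ", heap delta bytes: " + u.heapDeltaBytes);
    }
}
```

Repeating the measurement many times on the actual deployment target, and recording the distribution rather than a single value, gives the statistical characterization that soft real-time analysis relies on.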

Software  constructed  according  to  these 
conventions  is  easily  retargeted  and  inte¬ 
grated  into  a  variety  of  soft  real-time 
applications.  Developers  can  perform 
resource  needs  analysis  automatically  as 
part  of  the  dynamic  class  loading 
process  or  manually  as  part  of  the  soft¬ 
ware  maintenance  and  integration  effort. 

Many  soft  real-time  systems  have 
already  been  fielded  using  the  approach¬ 
es  described  in  this  section.  Projects  that 
publicly  acknowledge  their  adoption  of 
real-time  Java  approaches  include  Aegis 
Software  System  Upgrade  [11],  Boeing’s 
J-UCAS  effort  [12],  Calix  C7  Multi- 
Service  Access  System  [13],  the  FELIN 
(Fantassin  a  Equipements  et  Liaisons 
Integres)  wearable  computer  system 
[14],  Nortel’s  high-end,  long-haul  fiber¬ 
optic  switch  [15],  and  Varco’s  robotic  oil 
exploration  system  [16].  Developers 
consistently  find  these  approaches  to 
development  and  maintenance  of  soft 
real-time  software  result  in  significant 
developer  productivity  increases  and 
software  maintenance  cost  savings  in 
comparison  with  the  use  of  C  or  C++. 

Developing  Reusable  Hard 
Real-Time  Components 

When  developers  speak  of  hard  real¬ 
time  software,  they  generally  expect  that 
the  software  be  proven  to  satisfy  all  real¬ 
time  constraints  prior  to  execution. 
Programmers  who  write  hard  real-time 
software  expect  their  programming 
environment  to  support  at  minimum  the 
following: 

1.  Standard  libraries  precisely  con¬ 
strained  by  worst-case  CPU-time 
consumption  and  memory  usage. 

2.  Programming  language  syntactic  fea¬ 
tures  for  which  the  worst-case  CPU¬ 
time  and  memory  usage  of  the 
equivalent  machine-language  transla¬ 
tions  are  precisely  constrained. 

3.  Programming  language  and/or 
library  services  that  allow  developers 
to  speak  in  very  precise  terms  regard¬ 
ing  the  timing  constraints  imposed 
on  particular  real-time  software  com¬ 
ponents. 

4.  Development  tools  to  assist  with  the 
analysis  of  worst-case  CPU  time  and 
memory  requirements  for  particular
software  components.

Many  developers  of
hard  real-time  code  also  face
design  constraints  beyond  the  need  to 
prove  compliance  with  real-time  con¬ 
straints.  For  example,  developers  face 
severe  CPU  time,  memory  footprint, 
and  battery  power  constraints;  safety 
certification  requirements;  marketplace 
competition  that  demands  challenging 
functional  upgrades;  interoperability  and 
maintainability  requirements;  aggressive 
completion  schedules;  and  limited  devel¬ 
opment  budgets. 

Enforcing  Static  Properties 

Java  compilers  perform  more  static  prop¬ 
erty  analysis  to  enforce  stronger  consis¬ 
tency  checking  within  software  systems 
than  C  and  C++  compilers.  This  stronger 
consistency  checking  reduces  program¬ 
ming  errors,  further  improving  developer 
productivity.  Examples  of  built-in  static 
property  enforcement  in  Java  include  the 
following: 

1.  Strong  type  checking  prohibits 
incompatible  type  coercion  between, 
for  example,  an  integer  and  a  refer¬ 
ence  type,  or  between  two  incompat¬ 
ible  reference  types. 

2.  Programmers  are  prohibited  from 
adding  integer  values  to  reference 
values  to  create  references  to  new 
objects. 

3.  Byte-code  verification  assures  that 
the  types  of  actual  parameters  passed 
to  a  method  invocation  exactly 
match  the  types  of  the  formal  para¬ 
meters  declared  for  the  invoked 
method. 

4.  When  programmers  write  code  that 
will  throw  an  exception  under  certain 
conditions,  the  Java  compiler 
enforces  that  every  context  that 
invokes  this  code  is  prepared  to  catch 
the  exception.  This  reduces  the  like¬ 
lihood  that  error  conditions  will  be 
overlooked  or  ignored. 

This  consistency  checking  is  especially 
helpful  when  large  software  systems  are 
assembled  from  components  indepen¬ 
dently  produced  by  different  teams  of 
developers. 
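The exception rule in item 4 is easy to see in a small example. The sketch below uses hypothetical names (StaticChecks, SensorFault, readSensor); it relies only on standard Java behavior, namely that a checked exception must be either caught or declared by every caller, and that the compiler rejects casts between unrelated types and arithmetic on references.

```java
/** Hypothetical example of compile-time checks performed by a standard Java compiler. */
final class StaticChecks {

    /** A checked exception: every caller must catch it or declare it. */
    static final class SensorFault extends Exception {
        SensorFault(String message) {
            super(message);
        }
    }

    static int readSensor(int rawValue) throws SensorFault {
        if (rawValue < 0) {
            throw new SensorFault("negative reading: " + rawValue);
        }
        return rawValue;
    }

    /** Deleting this try/catch (without adding "throws") is a compile-time error. */
    static int readOrDefault(int rawValue, int fallback) {
        try {
            return readSensor(rawValue);
        } catch (SensorFault fault) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Both lines below are rejected by the compiler if uncommented:
        // String s = (String) Integer.valueOf(1);   // incompatible reference types
        // Object o = new Object() + 4;              // no arithmetic on references
        System.out.println(readOrDefault(-5, 0));
    }
}
```

Note that the caller may also satisfy the compiler by declaring the exception and passing it upward, so the check guarantees that the error is acknowledged somewhere on the call chain, not that it is handled well.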

Because  of  the  strong  static  property 
checking  already  enforced  by  the  Java 
programming  environment,  Java  has 
attracted  the  attention  of  software  engi¬ 
neers  as  a  possible  language  of  choice 
for  implementation  of  hard  real-time 
and  safety-critical  systems.  Hard  real-time
software  includes  radar  and  sonar
systems,  global  positioning  systems, 
software-defined  radios,  and  low-level 
device  drivers  for  various  peripheral 


devices.  Safety-critical  software  includes 
anti-lock  braking  systems  in  consumer 
vehicles,  fly-by-wire  control  of  flight 
surfaces  in  commercial  aircraft,  auto¬ 
matic  shutdown  of  nuclear  power 
plants,  computer-controlled  switching 
systems  in  passenger  railroad  systems, 
and  weapons  fire-control  software. 

To  address  the  needs  of  hard  real¬ 
time  developers,  the  Open  Group  is 
sponsoring  a  Java  community  process 
expert  group  to  establish  standards  for 
safety-critical  development  with  the  Java 
language.  The  Open  Group  is  a  vendor 
and  technology-neutral  consortium  that 
works  to  enable  access  to  integrated 
information  within  and  between  enter¬ 
prises  based  on  open  standards  and 
global  interoperability.  The  intention  is 
to  produce  a  standard  that  is  endorsed 
both  by  the  Java  Community  Process 
and  by  the  International  Organization 
for  Standardization  (ISO).  This  standard 
is  based  on  the  existing  Real-Time 
Specification  for  Java  (RTSJ)  [17].  As  a 
key  contributor  to  this  standardization 
effort,  Aonix  has  drafted  a  set  of  guide¬ 
lines  for  real-time  developers  who  desire 
to  use  the  Java  language  [9]. 


Programmers  who  develop  their  code 
according  to  these  draft  guidelines  for 
development  of  hard  real-time  and  safety-critical
Java  can  rely  on  the  following  assurances
from  a  special  byte-code  verifier:

1.  The  maximum  amount  of  CPU  time
required  to  execute  particular  meth¬ 
ods  (and  all  overriding  methods)  is 
bounded  by  a  constant  that  can  be 
derived  as  a  static  property  of  the 
program. 

2.  The  maximum  amount  of  stack 
memory  required  to  execute  particu¬ 
lar  methods  (and  all  overriding  meth¬ 
ods)  is  bounded  by  a  constant  that 
can  be  derived  as  a  static  property  of 
the  program. 

3.  Execution  of  a  particular  method  (or 
of  any  overriding  method)  will  not 
allocate  any  memory  in  the  shared 
immortal  heap. 

4.  No  blocking  operations  will  be  at¬ 
tempted  while  a  thread  holds  a  queue- 
free  priority  ceiling  emulation  lock. 

5.  Execution  of  particular  methods  will
not  result  in  throwing  of  RTSJ-defined
CeilingViolationException,  DuplicateFilterException,
IllegalAssignmentError,  InaccessibleAreaException,
MemoryAccessError,  MemoryScopeException,
MemoryTypeConflictException,  OutOfMemoryError,
ScopedCycleException,  StackOverflowError,
or  ThrowBoundaryError  exceptions.

Enforcement  of  these  static  properties
is  provided  by  a  special  byte-code  verifier
that  enforces  more  stringent  constraints
than  the  traditional  Java  byte-code  verifier.

Figure  1:  Real-Time  Java  Development  Environment

A  proposed  integrated  development 
environment  is  illustrated  in  Figure  1 
(see  page  21).  This  hard  real-time  devel¬ 
opment  environment  builds  upon  the 
popular  open-source  Eclipse  integrated 
development  environment.  COTS  tech¬ 
nologies,  color  coded  in  light  grey,  pro¬ 
vide  the  foundation  of  the  hard  real¬ 
time  development  environment.  Hard 
real-time  plug-ins  to  the  open  Eclipse 
architecture,  color  coded  as  dark  grey, 
provide  special  hard  real-time  develop¬ 
ment  tool  capabilities.  The  Hard  RT 
(Real-Time)  builder  takes  responsibility  for 
determining  which  parts  of  a  large  soft¬ 
ware  system  need  to  be  retranslated  by 
the  Eclipse  javac  program  and  reverified
by  the  Hard  RT  verifier  each  time  a  programmer
modifies  existing  source  code.
Errors  detected  by  either  Eclipse  javac  or
by  the  Hard  RT  Verifier  are  highlighted 
within  the  Eclipse  Java-syntax-directed 
editing  window,  providing  immediate 
developer  feedback,  and  simplifying  the 
development  process. 

The  functional  behavior  of  hard 
real-time  Java  code  can  be  exercised  and 
debugged  using  a  traditional  Java  5.0 
run-time  environment.  To  test  the  hard 
real-time  Java  code  on  the  target  hard¬ 
ware,  the  verified  Java  class  file  is  trans¬ 
lated  by  the  Hard  RT  translator  to  C  code. 
Then  it  is  compiled  and  linked  with  run¬ 
time  libraries  using  COTS  C-language 
development  tools.  C  compilers  are 
available  to  support  all  popular  embed¬ 
ded  processor  architectures  and  real¬ 
time  operating  systems  (OSs).  The  C 
code  generated  by  the  Hard  RT  translator 
can  be  loaded  directly  into  read-only 
memory  and  can  execute  in  place.  The 
Hard  RT  debugger  allows  developers  to
debug  the  executable  hard  real-time  pro¬ 
gram  using  a  familiar  Eclipse  Java 
debugging  environment,  even  though 
the  deployed  code  was  produced  by  C- 
language  development  tools. 

Note  that  the  hard  real-time  Java 
development  environment  translates 
Java  code  to  C  before  deployment.  This 
provides  much  higher  performance  (up 
to  3.5  times  faster  than  comparable  Java 
code  running  with  the  Sun  Microsys¬ 
tems  HotSpot  compiler),  much  smaller 
memory  footprint  (more  than  10  times 


smaller),  and  tighter  real-time  latencies 
(microseconds  vs.  tens  of  seconds)  than 
traditional  Java.  Comparisons  with  the 
performance  of  handwritten  C  code 
reveal  that  this  technology  generally  pro¬ 
duces  code  that  runs  within  35  percent 
(either  faster  or  slower)  of  comparable 
handwritten  C  code.  The  special  hard 
real-time  Java  verifier  automates  the  sta¬ 
tic  analysis  that  must  be  performed  by 
non-standard,  third  party  tools  when 
using  less  structured  languages  like  C 
and  C++.  Because  the  hard  real-time 
Java  verifier  is  tightly  integrated  with 
other  components  of  the  development 
tool  chain,  the  development  and  deploy¬ 
ment  process  is  much  smoother. 
Information  flows  automatically 
between  the  syntax-directed  editor,  the 
Java  compiler,  the  hard  real-time  byte-
code  verifier,  and  the  C  compiler  that
produces  the  final  machine-code  transla¬ 
tion  of  the  original  real-time  Java  pro¬ 
gram. 

When  hard  real-time  Java  technologies 
are  used  to  implement  safety-critical  sys¬ 
tems,  the  hard  real-time  Java  verifier 
imposes  additional  constraints  beyond  the 
constraints  that  are  imposed  on  typical 
hard  real-time  software.  These  additional 
constraints,  motivated  by  established  safe¬ 
ty  certification  guidelines,  help  enforce 
that  all  programmers  who  contribute  to 
the  development  of  a  safety-critical  soft¬ 
ware  system  adhere  to  the  same  conserva¬ 
tive  development  practices. 

The  most  popular  alternative 
approach  to  development  of  hard  real-time
code  involves  the  use  of  MISRA  C
[18].  In  comparison  to  MISRA  C,  the  hard
real-time  Java  approach  offers  superior 
portability,  scalability,  and  maintainabili¬ 
ty.  It  also  provides  much  easier,  more 
reliable,  and  more  efficient  integration 
with  higher  level  Java  software  which  is 


typical  of  the  much  larger  non-real-time 
and  soft  real-time  layers  of  many  mis¬ 
sion-critical  software  hierarchies. 
Another  key  advantage  of  developing 
and  maintaining  hard  real-time  code
with  Java  tools  is  that  today’s  graduating
software  engineering  students  are  much
more  likely  to  be  proficient  in  Java  than
in  any  other  programming  language.
When  compared  with  MISRA  C,  the  key
disadvantages  of  the  hard  real-time  Java
approach  are  that  the  technology  is
newer  and  less  familiar  to  established
safety-critical  domain  experts,  and  the
hard  real-time  Java  approach  introduces
an  additional  abstraction  layer  between 
source  code  and  deployment.  This  addi¬ 
tional  abstraction  layer  contributes  to 
ease  of  software  maintainability,  porta¬ 
bility,  and  software  scalability,  analogous 
to  abstraction  improvements  provided 
by  C  over  assembly  language.  Some 
additional  effort  may  be  required  to 
trace  requirements  through  source  code 
and  multiple  intermediate  code  repre¬ 
sentations  to  the  final  machine  code 
implementation. 

Memory  for  Temporary  Objects 

Since  Java  is  an  object-oriented  program¬ 
ming  language,  all  structured  data  is  repre¬ 
sented  by  objects.  With  a  traditional  Java 
run-time  environment,  all  objects  are  allo¬ 
cated  within  a  region  of  memory  known 
as  the  heap,  and  the  memory  for  these 
objects  is  reclaimed  by  an  automatic 
garbage  collector.  For  hard  real-time 
development,  the  run-time  environment 
does  not  include  an  automatic  garbage 
collector. 

The  hard  real-time  Java  guidelines 
allow  Java  components  to  allocate  objects 
on  the  run-time  stack  using  programming 
constructs  similar  to  those  of  the  C  and 
C++  programming  languages.  Unlike  C 
and  C++,  statically  enforced  programmer 
annotations  guarantee  that  the  use  of 
stack-allocated  memory  does  not  create 
dangling  pointers. 

In  summary,  the  hard  real-time  version 
of  Java  allocates  all  objects  on  the  run¬ 
time  stack  of  the  current  thread  rather 
than  using  a  garbage-collected  heap.  Any 
running  thread  may  spawn  additional 
threads  by  dedicating  a  portion  of  its  stack 
to  represent  the  run-time  stack  of  the 
spawned  thread.  Any  objects  that  need  to 
be  shared  between  multiple  threads  must 
be  allocated  within  the  stack  of  some 
ancestor  thread. 

The  key  benefits  of  safely  allocating 
temporary  objects  on  the  run-time  stack 
are  the  following: 

•  Temporary  memory  allocation  and
deallocation  are  very  fast.

•  Temporary  memory  allocation  is  very 
reliable,  because  the  stack  never 
becomes  fragmented,  the  maximum 
stack  size  can  be  determined  through 
static  analysis,  and  the  absence  of  dan¬ 
gling  pointers  is  guaranteed  through 
the  use  of  enforceable  programmer 
annotations. 

For  hard  real-time  software,  this  gives  Java 
developers  capabilities  and  performance 
comparable  to  more  traditional  languages 
like  Ada,  C,  and  C++. 
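The speed and fragmentation claims follow from the stack discipline itself, which can be imitated in ordinary Java. The sketch below is a hypothetical region allocator (StackArena is an invented name) built on one preallocated array; it is not the hard real-time Java mechanism, and in particular it does nothing to prevent a stale index from being kept after release. The statically enforced annotations described above are what close that gap.

```java
/** Hypothetical region allocator; not the article's hard real-time Java mechanism. */
final class StackArena {
    private final long[] slots; // preallocated once, so later use never touches the heap
    private int top = 0;        // index of the first free slot

    StackArena(int capacity) {
        slots = new long[capacity];
    }

    /** Records the current top so that a whole "frame" can be released at once. */
    int mark() {
        return top;
    }

    /** Reserves count slots in constant time and returns the index of the first. */
    int allocate(int count) {
        if (count < 0 || count > slots.length - top) {
            throw new IllegalStateException("arena exhausted");
        }
        int base = top;
        top += count;
        return base;
    }

    /** Frees everything allocated since the given mark, also in constant time. */
    void release(int mark) {
        if (mark < 0 || mark > top) {
            throw new IllegalArgumentException("invalid mark");
        }
        top = mark;
    }

    void set(int index, long value) {
        slots[checked(index)] = value;
    }

    long get(int index) {
        return slots[checked(index)];
    }

    int used() {
        return top;
    }

    private int checked(int index) {
        if (index < 0 || index >= top) {
            throw new IndexOutOfBoundsException("slot " + index + " is not live");
        }
        return index;
    }

    public static void main(String[] args) {
        StackArena arena = new StackArena(16);
        int frame = arena.mark();
        int buf = arena.allocate(4); // temporary working storage
        for (int i = 0; i < 4; i++) {
            arena.set(buf + i, i + 1);
        }
        long sum = 0;
        for (int i = 0; i < 4; i++) {
            sum += arena.get(buf + i);
        }
        arena.release(frame); // temporaries vanish with the frame
        System.out.println("sum=" + sum + ", used after release=" + arena.used());
    }
}
```

Allocation and release are single integer updates, free space is always contiguous, and the capacity passed to the constructor plays the role of the maximum stack size that static analysis would supply.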

Synergy  Between  Java 
Technologies 

The  embedded  real-time  market  has  been 
described  as  a  thousand  different  niches. 
Each  critical  software  component  repre¬ 
sents  different  requirements  and  econom¬ 
ic  trade-offs.  Figure  2  provides  a  decision 
tree  to  assist  system  architects  in  deciding 
how  to  implement  particular  capabilities. 

Notably,  soft  real-time  Java  develop¬ 
ment  guidelines  are  generally  preferred 
unless  there  is  a  specific  constraint  that
precludes  them  because  soft  real-time  Java 
offers  the  highest  developer  and  software 
maintenance  productivity. 

The  hard  real-time  Java  technologies 
do  not  use  automatic  garbage  collection. 
Instead,  dynamic  memory  is  allocated  and 
deallocated  under  more  explicit  program¬ 
mer  control.  The  safety-critical  Java  stan¬ 
dard  supports  a  subset  of  the  hard  real¬ 
time  Java  capabilities. 

Note  that  this  decision  tree  does  not 
distinguish  between  different  security 
requirements.  Security  issues  are  largely 
orthogonal  to  real-time  issues  and  are  not 
the  focus  of  this  article.  Techniques  for 
enforcing  multiple  independent  levels  of 
security  are  compatible  with  hard,  soft, 
and  safety-critical  Java  technologies. 

Selective  Sharing  of  Control  With 
Traditional  Java  Components 

The  recommended  approach  for  pro¬ 
viding  efficient  and  reliable  integration 
of  hard  real-time  components  with  tra¬ 
ditional  Java  components  uses  a  restric¬ 
tive  form  of  object  sharing  between  the 
hard  real-time  and  traditional  Java 
domains.  The  shared  objects  always 
reside  in  the  hard  real-time  domain  and 
do  not  participate  in  garbage  collection. 
Since  these  are  hard  real-time  objects, 
they  are  never  subject  to  relocation. 
This  greatly  simplifies  the  implementa¬ 
tion  and  improves  execution  efficiency. 

We  describe  object  sharing  as  restric¬ 
tive  because  the  traditional  Java  domain 
cannot  see  any  of  the  instance  or  static 


variables  associated  with  the  hard  real¬ 
time  object.  Furthermore,  it  cannot  see 
the  object’s  regular  methods.  It  can  only 
see  methods  specially  designated  as  tra¬ 
ditional  Java  methods.  These  traditional 
Java  methods  are  analogous  to  OS  entry 
points  for  application  software.  In  this 
regard,  writing  hard  real-time  Java  code 
is  similar  to  making  modifications  to  an 
operating  system  kernel.  As  with  tradi¬ 
tional  operating  system  design,  greater 
trust  is  placed  in  the  implementers  of 
the  lower-level  OS  software,  and  great 
care  is  taken  to  ensure  that  errors  or 
malicious  intent  of  application  software 
do  not  compromise  the  integrity  of 
lower  level  components. 

In  traditional  OS  design,  invocation 
of  kernel  services  generally  crosses  a 
memory  protection  barrier,  and  hard¬ 
ware  memory  management  units  assure 
that  application  code  cannot  see  or 
modify  kernel  code  and  data  structures. 
The  restrictive  object  sharing  architec¬ 
ture  supports  the  same  abstraction  guar¬ 
antees,  but  it  does  so  using  static  byte¬ 
code  verification  techniques  which 
allow  much  more  efficient  integration 
of  the  hard  real-time  and  non-real-time 
software.  The  performance  benefits  of 
this  architecture  have  already  been 
demonstrated,  for  example,  in  studies 
conducted  by  Calton  Pu  on  the 
Synthesis  Kernel  [19]. 
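The shape of restrictive sharing can be suggested with ordinary Java interfaces. In the sketch below all names (TrackFeed, RadarTracker, SharingDemo) are hypothetical, and plain Java access control is a weaker stand-in for the byte-code verification described above: nothing here stops traditional code from casting the interface back to the concrete class, which is exactly the kind of escape the special verifier is meant to rule out.

```java
final class SharingDemo {
    /** Hands out the narrow interface instead of the tracker itself. */
    static TrackFeed publish(RadarTracker tracker) {
        return tracker;
    }

    public static void main(String[] args) {
        RadarTracker tracker = new RadarTracker();
        tracker.onSample(1200);            // real-time side updates its state
        TrackFeed feed = publish(tracker); // traditional side sees only TrackFeed
        System.out.println(feed.latestRangeMeters());
    }
}

/** The only view handed to traditional Java code (all names are hypothetical). */
interface TrackFeed {
    int latestRangeMeters();
}

/** Stands in for a hard real-time object; its state and regular methods stay hidden. */
final class RadarTracker implements TrackFeed {
    private volatile int rangeMeters; // not reachable through TrackFeed

    /** Regular method: intended for the real-time side only. */
    void onSample(int meters) {
        rangeMeters = meters;
    }

    /** Designated entry point, analogous to an OS system call. */
    @Override
    public int latestRangeMeters() {
        return rangeMeters;
    }
}
```

The entry point returns a primitive value rather than a reference, so no object owned by the real-time side leaks into the garbage-collected domain.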

When  systems  are  composed  of  a
combination  of  hard  and  soft  real-time
components,  the  hard  real-time  components
will  typically  run  with  footprint
and  throughput  efficiency  very  close  to 
that  of  optimized  C.  This  represents  a 
three-fold  improvement  over  typical 
optimized  Java  performance  and  foot¬ 
print.  There  are  many  important  mis¬ 
sion-critical  needs  that  can  be  addressed 
by  this  configuration,  such  as  the  fol¬ 
lowing: 

•  Portable  and  very  efficient  device 
drivers  (possibly,  but  not  necessarily, 
having  hard  real-time  constraints). 

•  Compared  with  the  use  of  the  Java 
Native  Interface  (JNI),  interfaces  to
legacy  (native)  components  written  in
other  languages  are  much  more  effi¬ 
cient  and  much  safer  if  implemented 
using  the  hard  real-time  Java  tech¬ 
nologies  as  an  intermediary  between 
traditional  Java  and  the  native  code. 

•  Performance-critical  code  such  as 
Fourier  analysis  and  matrix  manipu¬ 
lation  can  be  provided  much  more 
efficiently  as  hard  real-time  Java 
components  than  as  traditional  Java 
code  or  as  legacy  code  interfaced  to 
Java  through  JNI. 


The  hard  real-time  Java  approaches 
described  here  are  only  now  becoming 
commercially  available,  so  current  expe¬ 
rience  is  limited  to  research  experi¬ 
ments.  A  number  of  commercial  and 
defense  projects  are  beginning  develop¬ 
ment  based  on  these  technologies. 
Information  should  be  available  in 
another  year  or  so. 

Using  Off-the-Shelf 
Components  in  Real-Time 
Systems 

The  key  to  using  traditional  off-the-shelf 
Java  components  in  real-time  systems 
involves  careful  partitioning  of  capabili¬ 
ties  to  ensure  that  reliable  operation  of 
real-time  components  is  not  compromised 
by  the  less-disciplined  behavior  of  non-
real-time  components.  System  architects 
and  integrators  need  to  evaluate  which 
partitioning  approaches  are  most  appro¬ 
priate  for  each  particular  application’s 
requirements.  Among  the  practices  in 
common  use  today  are  the  following: 

1.  Structure  the  application  so  non-real¬ 
time  code  runs  in  a  different  VM  (pos¬ 
sibly  even  on  a  different  processor) 
than  real-time  code.  By  isolating  the 
non-real-time  code  within  a  distinct 
VM,  it  is  possible  to  enforce  strict 
memory  budgets  and  limit  the  scheduling
priorities  at  which  the  non-real-time
components  consume  CPU  time.

2.  Allow  selected  non-real-time  compo¬ 
nents  to  run  on  the  same  VM  as  more 
carefully  constructed  soft  real-time 
components  only  after  thorough  test¬ 
ing  of  the  non-real-time  software  to 
establish  sufficient  confidence  that  the 
resource  needs  of  this  non-real-time
software  are  well  understood. 

3.  Take  extra  care  to  assure  that  real-time 
software  does  not  have  to  wait  indefi¬ 
nitely  for  interactions  with  non-real¬ 
time  software.  For  example: 

a.  Run  the  non-real-time  software 
within  a  different  VM  than  the 
real-time  software. 

b.  Avoid  blocking  on  synchronization 
for  objects  that  are  shared  between 
non-real-time  and  real-time  com¬ 
ponents. 

c.  Or  make  use  of  alternative  information
sources  (e.g.,  approximations)
whenever  a  non-real-time
component  does  not  deliver  critical 
information  to  the  real-time  com¬ 
ponent  within  the  window  of  time 
that  is  required  in  order  to  satisfy 
end-to-end  timing  constraints. 

4.  Run  components  with  the  most  strin¬ 
gent  real-time  constraints  within  a 
hard  real-time  environment  at  priori¬ 
ties  higher  than  all  soft  real-time  com¬ 
ponents.  The  hard  real-time  environ¬ 
ment  enforces  strict  memory  parti¬ 
tioning  to  prevent  memory  contention 
from  non-real-time  components  run¬ 
ning  in  the  traditional  Java  VM  envi¬ 
ronment. 

These  are  the  approaches  that  have  been 
successfully  deployed  in  the  various  pro¬ 
jects  mentioned  in  [11]  through  [16]. 
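Practices 3b and 3c can be combined in a few lines using the standard java.util.concurrent queues. The sketch below is an illustration under stated assumptions rather than code from any of the cited projects: BoundedWait, freshOrApproximate, and the sample values are hypothetical, and on a conventional VM the timeout is only as precise as the underlying scheduler allows.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: a real-time consumer that never waits indefinitely. */
final class BoundedWait {

    /** Waits at most timeoutMicros for a fresh value; otherwise returns the approximation. */
    static double freshOrApproximate(BlockingQueue<Double> fromNonRealTime,
                                     long timeoutMicros,
                                     double approximation) {
        try {
            Double fresh = fromNonRealTime.poll(timeoutMicros, TimeUnit.MICROSECONDS);
            return fresh != null ? fresh : approximation;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return approximation;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<Double> channel = new ArrayBlockingQueue<>(1);
        double lastKnown = 20.0; // hypothetical previous reading used as the approximation
        System.out.println(freshOrApproximate(channel, 500, lastKnown)); // nothing queued yet
        channel.offer(21.5);
        System.out.println(freshOrApproximate(channel, 500, lastKnown)); // fresh value present
    }
}
```

The non-real-time producer would use the non-blocking offer call, so neither side can stall the other; the timeout should be chosen from the end-to-end timing budget of the real-time component.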

Conclusions 

By  restricting  the  use  of  Java  program¬ 
ming  language  features  and  libraries,  and 
by  exploiting  special  static  analysis  tools,  it 
is  possible  to  apply  many  of  the  benefits 
of  the  Java  programming  language  to  the 
specialized  domain  of  real-time  software. 
Standards  to  support  these  development 
approaches  are  being  developed  under  the 
auspices  of  the  Open  Group,  which  is
working  to  establish  standards  that  can  be 
endorsed  both  by  the  Java  Community 
Process  and  by  ISO. 

By carefully partitioning functionality between high- and
low-level software, it is possible to leverage the best strengths
of Java within each respective programming domain. Lower-level
Java technologies are most appropriate for implementing low-level
device drivers, interrupt handlers, safe and efficient interfaces
to legacy (native) code, and certain performance-critical
components such as Fast Fourier transforms. Higher-level Java
technologies are most appropriate for larger, more complex
functionality, especially subsystems that need to support dynamic
reconfiguration and/or generic reuse of existing off-the-shelf
Java software components.
The Relative Cost of Interchanging, Adding, or
Dropping Quality Practices


Bob  McCann 
Lockheed Martin Aeronautics

In developing systems and software, there are multiple opportunities to perform quality practices to find and to fix defects prior
to putting the system or software into operations. This article demonstrates the following conclusions: 1) In general, quality
practices should be ordered by increasing average cost to find and fix defects. Fixed costs do not affect this conclusion, but
significant differences in either defect detection effectiveness or in the effectiveness of verifying rework-induced defects can modify
the conclusion. 2) One should retain or add a second quality practice provided the second practice fixes more defects during
rework than the second practice creates during rework, provided the average cost to fix defects downstream is much larger than
both the second practice's fixed costs and the second practice's marginal cost to find and fix defects.


In the beginning of a new project, project management gets to
decide a very important issue, namely which work products are
subject to a verification process, e.g., peer reviews, formal
inspection, testing, etc. Some work products may even be deemed
sufficiently critical that they are subjected to multiple
verification processes during the development life cycle (e.g.,
requirements inspection, design inspection, code inspection, code
desk check, compile and fix, informal peer reviews of various
kinds, and various flavors of testing).

In these cases, there is nearly always a discussion of whether to
use an informal peer review instead of a formal inspection,
whether or not to skip the pre-compile desk check, and whether to
perform the code inspection before or after the first successful
compile or even after the completion of unit testing. What is
always present is the persistent nagging feeling that too much or
too little was spent on verification. This article, together with
the two previous articles on cost effectiveness of inspections¹,
addresses that issue with a simple quantitative cost analysis
model. It should be noted that this model is easily extended when
a cause of variation in any of the factors becomes known.

In what follows, statistical reasoning is used (where that is not
appropriate, the results may differ²). For instance, it is highly
unlikely that a compilation test will discover a design defect
(such as poor choice of algorithm) that results in a performance
problem. In contrast, it is quite likely that a formal inspection
of the design, a formal inspection of the code, and a formal
performance test will all have a statistical likelihood of
discovering that same design defect.

Warning: There may be simpler, more elegant proofs of the above
results than what follows. If algebra or statistics give you a
headache or other trauma, the author apologizes for any discomfort
from what follows.

Analysis 

Suppose three or more adjacent quality practices that find and fix
defects are performed in series (e.g., personal desk check,
one-on-one peer review, compile and fix, formal inspection, unit
testing, etc.³). Further suppose the cost effectiveness of each is
measurable⁴ (please note that cost can be any independent variable
of interest: dollars, labor hours, schedule impact, etc.):

• Let Q_j be quality practice j, where j is 1, 2, ...

• Let F_j be the fixed and sunk costs of quality practice Q_j⁵.
Presumably F_j will be small compared to other cost terms if
there are a significant number of defects.

• Let C_j be the average cost per defect found and fixed for
practice Q_j, including verification practice V_j.

• Let E_{Qj} be the average effectiveness (the fraction of defects
present that are found) for practice Q_j. Thus E_{Qj}*C_j would
be the probable cost of finding and fixing a defect with quality
practice Q_j.

• Let E_{Vj} be the average effectiveness (the fraction of defects
present that are found) for practice V_j. Thus E_{Vj}*C_j would
be the probable cost of finding and fixing a defect with
verification practice V_j.

• Let I_{Rj} be the number of defects inserted by the rework due
to Q_j.

• Let I_0 be the number of defects inserted by earlier development
practices. There is no breakage in this analysis if we only
consider defects that are discovered sometime in the product
life cycle. Defects that never get exercised have no actual
impact, just potential impact.

• Let II_0 be the number of defects entering the second quality
practice.

• Let III_{ij} be the number of defects escaping both practices
when Q_i precedes Q_j.

• Let R_j be the rework activity associated with quality practice
Q_j.

•  Let  Vj  be  the  average  effectiveness  of 
the  verification  of  rework  performed 
in  Qj.  For  algebraic  simplicity  we  will 
assume  Vj  is  approximately  equal  to  Ej. 

• Let TLC be the total life-cycle cost, with subscripts indicating
what quality practices are performed and in which order. Note
that the letters TLC can also mean Tender Loving Care throughout
the full life cycle, which is what this author thinks is
necessary to comply with the intent of section 804 of the Bob
Stump Act of 2003⁶. Thus, TLC_{12} refers to the case where Q_1
is performed before Q_2.

• Let TLC_{12} be the total life-cycle cost when both Q_1 and Q_2
are present.

• Let TLC_{1_} be the total life-cycle cost when Q_1 is present
and Q_2 is absent.

• Let TLC_{_2} be the total life-cycle cost when only Q_2 is
present and Q_1 is absent.

Figure 1 shows the defect flow associated with quality practice
Q_1. In Figure 1 the circles represent tasks, and the boxes
represent collections of defects.

• Q_1 is a quality practice that finds a fraction E_{Q1} of the
defects present in some set of work products.

• R_1 is the task that does the impact analysis and repair for
each of the identified defects. This in principle may introduce
a new set I_{R1} of defects.

• V_1 is the verification task that follows the rework effort. It
verifies the solutions to the defects discovered in Q_1 and
finds a fraction E_{V1} of the new defects.

• Corrected defects do not propagate further.

• Escaped defects propagate to downstream processes.
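
The defect flow just described for a single practice can be sketched in a few lines of Python. This is a minimal illustration, not the author's spreadsheet: the names `PracticeResult` and `apply_practice` are hypothetical, and the sample numbers are the inspection values used in the article's worked example.

```python
from dataclasses import dataclass


@dataclass
class PracticeResult:
    found_original: float  # incoming defects found by Q
    found_rework: float    # rework-induced defects found by V
    escaped: float         # defects that propagate downstream


def apply_practice(incoming, e_q, i_r, e_v):
    """One Q -> R -> V stage: Q finds a fraction e_q of the incoming
    defects, rework inserts i_r new ones, V finds a fraction e_v of those."""
    found_original = incoming * e_q
    found_rework = i_r * e_v
    escaped = incoming * (1.0 - e_q) + i_r * (1.0 - e_v)
    return PracticeResult(found_original, found_rework, escaped)


if __name__ == "__main__":
    # Inspection values from the worked example: I_0 = 20,000,
    # E_Q1 = 0.75, I_R1 = 4,500, E_V1 = 0.9.
    result = apply_practice(20000, 0.75, 4500, 0.9)
    print(f"found by Q: {result.found_original:.0f}")
    print(f"found by V: {result.found_rework:.0f}")
    print(f"escaped:    {result.escaped:.0f}")
```

Feeding the `escaped` value of one stage into the next call as `incoming` chains two practices in series.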

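When we stack two quality practices in a row, the input to the
second practice consists of the escaped defects from the first
practice. Algebraically, this is accomplished by replicating
Figure 1, changing the subscript 1 to 2, and replacing I_0 with
II_0 = I_0*(1-E_{Q1}) + I_{R1}*(1-E_{V1}), as is shown in Figure 2.

Given the model described by Figure 1 and Figure 2, it is now
algebraically possible to answer the following questions:

1. What is the cost increase in reversing the order of application
of the two practices, assuming all downstream defects eventually
create an average cost C_3 per defect?

\[
\begin{aligned}
\mathrm{TLC}_{12} &= F_1 + C_1 (I_{R1} E_{V1} + I_0 E_{Q1})
   + F_2 + C_2 (I_{R2} E_{V2} + \mathrm{II}_0 E_{Q2}) + F_3 + C_3 \mathrm{III}_{12} \\
 &= F_1 + C_1 (I_{R1} E_{V1} + I_0 E_{Q1}) \\
 &\quad + F_2 + C_2 \{ I_{R2} E_{V2} + [ I_0 (1 - E_{Q1}) + I_{R1} (1 - E_{V1}) ] E_{Q2} \}
   + F_3 + C_3 \mathrm{III}_{12} \\
 &= F_1 + F_2 + F_3 + C_1 I_{R1} E_{V1} + C_2 I_{R2} E_{V2} \\
 &\quad + C_1 I_0 E_{Q1} + C_2 I_0 (1 - E_{Q1}) E_{Q2}
   + C_2 I_{R1} (1 - E_{V1}) E_{Q2} + C_3 \mathrm{III}_{12} \\[4pt]
\mathrm{TLC}_{21} &= F_1 + F_2 + F_3 + C_2 I_{R2} E_{V2} + C_1 I_{R1} E_{V1}
   + C_2 I_0 E_{Q2} \\
 &\quad + C_1 I_0 (1 - E_{Q2}) E_{Q1} + C_1 I_{R2} (1 - E_{V2}) E_{Q1}
   + C_3 \mathrm{III}_{21}
\end{aligned}
\]

and

\[
\begin{aligned}
\mathrm{III}_{12} &= I_{R2} (1 - E_{V2}) + \mathrm{II}_0 (1 - E_{Q2}) \\
 &= I_{R2} (1 - E_{V2}) + [ I_0 (1 - E_{Q1}) + I_{R1} (1 - E_{V1}) ] (1 - E_{Q2}) \\
\mathrm{III}_{21} &= I_{R1} (1 - E_{V1}) + [ I_0 (1 - E_{Q2}) + I_{R2} (1 - E_{V2}) ] (1 - E_{Q1})
\end{aligned}
\]

so

\[
\mathrm{III}_{21} - \mathrm{III}_{12} = I_{R1} E_{Q2} (1 - E_{V1}) - I_{R2} E_{Q1} (1 - E_{V2})
\]

thus

\[
\begin{aligned}
\mathrm{TLC}_{21} - \mathrm{TLC}_{12}
 &= I_0 (C_2 E_{Q2} - C_1 E_{Q1})
   + I_0 [ C_1 (1 - E_{Q2}) E_{Q1} - C_2 (1 - E_{Q1}) E_{Q2} ] \\
 &\quad + C_1 I_{R2} (1 - E_{V2}) E_{Q1} - C_2 I_{R1} (1 - E_{V1}) E_{Q2}
   + C_3 (\mathrm{III}_{21} - \mathrm{III}_{12}) \\
 &= I_0 (C_2 - C_1) E_{Q1} E_{Q2} + C_1 I_{R2} (1 - E_{V2}) E_{Q1}
   - C_2 I_{R1} (1 - E_{V1}) E_{Q2} \\
 &\quad + C_3 (\mathrm{III}_{21} - \mathrm{III}_{12}) \\
 &= I_0 (C_2 - C_1) E_{Q1} E_{Q2} + C_1 I_{R2} (1 - E_{V2}) E_{Q1}
   - C_2 I_{R1} (1 - E_{V1}) E_{Q2} \\
 &\quad + C_3 [ I_{R1} E_{Q2} (1 - E_{V1}) - I_{R2} E_{Q1} (1 - E_{V2}) ]
\end{aligned}
\]

Dividing this equation by C_3*I_0*E_{Q1}*E_{Q2} and setting the
result to zero demonstrates the clarity of dimensionless ratios:

\[
\frac{\mathrm{TLC}_{21} - \mathrm{TLC}_{12}}{C_3 I_0 E_{Q1} E_{Q2}} = 0
\]

or

\[
\frac{C_2 - C_1}{C_3}
 + \frac{C_3 - C_2}{C_3} \cdot \frac{I_{R1}}{I_0} \cdot \frac{1 - E_{V1}}{E_{Q1}}
 - \frac{C_3 - C_1}{C_3} \cdot \frac{I_{R2}}{I_0} \cdot \frac{1 - E_{V2}}{E_{Q2}} = 0
\]

(Equation 1)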

The solution to this equation divides the parameter space into two
regions: one in which the interchange is cost effective and one in
which it is not. When the left-hand side is positive, it is more
cost effective to perform Q_1 before Q_2.

Although the first term is the one intuition would quickly
identify, please note that because C_3 can be much larger than
either C_1 or C_2, the second and third terms may dominate the
outcome, especially during the operations and maintenance phase.
Note that the fixed cost contributions all cancel exactly.
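
As a rough illustration of how Equation 1 might be evaluated, the sketch below computes its left-hand side. The function name `interchange_indicator` is hypothetical, and the sample inputs are borrowed from the article's worked example (inspection followed by pre-delivery testing), with the rework counts held fixed as the algebra assumes.

```python
def interchange_indicator(c1, c2, c3, i0, i_r1, i_r2, e_q1, e_q2, e_v1, e_v2):
    """Left-hand side of Equation 1 (dimensionless).

    Positive: performing Q1 before Q2 is cheaper.
    Negative: performing Q2 before Q1 is cheaper.
    """
    cost_term = (c2 - c1) / c3
    q1_rework_term = ((c3 - c2) / c3) * (i_r1 / i0) * (1.0 - e_v1) / e_q1
    q2_rework_term = ((c3 - c1) / c3) * (i_r2 / i0) * (1.0 - e_v2) / e_q2
    return cost_term + q1_rework_term - q2_rework_term


if __name__ == "__main__":
    lhs = interchange_indicator(
        c1=1.0, c2=40.0, c3=100.0, i0=20000, i_r1=4500, i_r2=1635,
        e_q1=0.75, e_q2=0.6, e_v1=0.9, e_v2=0.6,
    )
    print(f"Equation 1 left-hand side: {lhs:.3f}")
    print("perform Q1 first" if lhs > 0 else "perform Q2 first")
```

Because fixed costs cancel, they do not appear among the parameters at all.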

In general, the quality practices should be ordered by increasing
average cost to find and fix defects. Fixed costs do not affect
this conclusion, but significant differences in either defect
detection effectiveness or in the effectiveness of verifying
rework-induced defects can modify the conclusion.

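2. What is the cost of adding or dropping a quality practice?

A. Suppose we add or drop Q_1:

\[
\begin{aligned}
\mathrm{TLC}_{12} &= F_1 + F_2 + C_1 I_{R1} E_{V1} + C_2 I_{R2} E_{V2}
   + C_1 I_0 E_{Q1} + C_2 I_0 (1 - E_{Q1}) E_{Q2} \\
 &\quad + C_2 I_{R1} (1 - E_{V1}) E_{Q2} + F_3 + C_3 \mathrm{III}_{12}
\end{aligned}
\]

where,

\[
\mathrm{III}_{12} = I_{R2} (1 - E_{V2}) + [ I_0 (1 - E_{Q1}) + I_{R1} (1 - E_{V1}) ] (1 - E_{Q2})
\]

\[
\mathrm{TLC}_{\_2} = F_2 + C_2 (I_{R2} E_{V2} + I_0 E_{Q2}) + F_3 + C_3 \mathrm{III}_{\_2}
\]

where,

\[
\mathrm{III}_{\_2} = I_0 (1 - E_{Q2}) + I_{R2} (1 - E_{V2})
\]

so

\[
\mathrm{III}_{\_2} - \mathrm{III}_{12} = [ I_0 E_{Q1} - I_{R1} (1 - E_{V1}) ] (1 - E_{Q2})
\]

thus

\[
\begin{aligned}
\mathrm{TLC}_{\_2} - \mathrm{TLC}_{12}
 &= (I_0 E_{Q1} + I_{R1} E_{V1}) (C_2 E_{Q2} - C_1) \\
 &\quad + C_3 [ I_0 E_{Q1} - I_{R1} (1 - E_{V1}) ] (1 - E_{Q2})
   - [ F_1 + C_2 I_{R1} E_{Q2} ]
\end{aligned}
\]

This case will require individual analysis using actual (or
accurately estimated) cost performance data.

Keeping/adding Q_1 is better if

\[
(I_0 E_{Q1} + I_{R1} E_{V1}) (C_2 E_{Q2} - C_1)
 + C_3 [ I_0 E_{Q1} - I_{R1} (1 - E_{V1}) ] (1 - E_{Q2})
 > F_1 + C_2 I_{R1} E_{Q2}
\]

(Equation 2)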

Indeed, if C_3 is sufficiently large and I_{R1} is sufficiently
small, it will always be practical to keep/add a quality practice.
However, if I_{R1} is sufficiently large, then the converse will
be true. In this case, the cost incurred due to mistakes inserted
during rework swamps the value of mistakes actually found and
fixed. Under those conditions it is better to drop the (broken)
quality practice.
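
The sign change described above can be seen by evaluating the difference behind Equation 2 for two rework levels. This is a hedged sketch: `gain_from_keeping_q1` is a hypothetical name, the first call uses the article's worked-example values, and the second uses a deliberately extreme, invented rework count (ten new defects per defect fixed) purely to show the converse case.

```python
def gain_from_keeping_q1(i0, i_r1, e_q1, e_v1, e_q2, c1, c2, c3, f1=0.0):
    """TLC_{_2} - TLC_{12}: positive means keeping Q1 lowers total cost."""
    found_by_q1 = i0 * e_q1 + i_r1 * e_v1          # defects fixed in Q1 and V1
    net_removed = i0 * e_q1 - i_r1 * (1.0 - e_v1)  # net reduction passed on to Q2
    return (found_by_q1 * (c2 * e_q2 - c1)
            + c3 * net_removed * (1.0 - e_q2)
            - (f1 + c2 * i_r1 * e_q2))


if __name__ == "__main__":
    common = dict(i0=20000, e_q1=0.75, e_v1=0.9, e_q2=0.6,
                  c1=1.0, c2=40.0, c3=100.0)
    # Worked-example rework level: 0.3 new defects per defect fixed.
    print(f"{gain_from_keeping_q1(i_r1=4500, **common):,.0f}")
    # Invented extreme: 10 new defects per defect fixed.
    print(f"{gain_from_keeping_q1(i_r1=150000, **common):,.0f}")
```

With the worked-example inputs the gain is positive (keep the practice); with the extreme rework count it turns negative, which is the "broken practice" situation.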

It is also true that very high fixed costs can cause a quality
practice to become impractical. This is especially true when the
cost of primary concern is a very aggressive development schedule
commitment. It takes serious discipline on the part of both
development management and customer management to put long-term
goals before short-term concerns. Recent congressional and
Department of Defense efforts to emphasize total life-cycle costs
appear to be an attempt to provide a context in which this
long-term focus is even possible⁷.

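B. Suppose we add or drop Q_2:

\[
\begin{aligned}
\mathrm{TLC}_{1\_} &= F_1 + C_1 (I_{R1} E_{V1} + I_0 E_{Q1}) + F_3 + C_3 \mathrm{III}_{1\_} \\
\mathrm{TLC}_{12} &= F_1 + C_1 (I_{R1} E_{V1} + I_0 E_{Q1}) + F_2
   + C_2 (I_{R2} E_{V2} + \mathrm{II}_0 E_{Q2}) + F_3 + C_3 \mathrm{III}_{12}
\end{aligned}
\]

\[
\mathrm{TLC}_{1\_} - \mathrm{TLC}_{12}
 = C_3 (\mathrm{III}_{1\_} - \mathrm{III}_{12}) - F_2 - C_2 (I_{R2} E_{V2} + \mathrm{II}_0 E_{Q2})
\]

but III_{1_} = II_0

so

\[
\begin{aligned}
\mathrm{TLC}_{1\_} - \mathrm{TLC}_{12}
 &= C_3 \{ \mathrm{II}_0 - [ \mathrm{II}_0 (1 - E_{Q2}) + I_{R2} (1 - E_{V2}) ] \}
   - F_2 - C_2 (I_{R2} E_{V2} + \mathrm{II}_0 E_{Q2}) \\
 &= C_3 [ \mathrm{II}_0 E_{Q2} - I_{R2} (1 - E_{V2}) ]
   - C_2 (I_{R2} E_{V2} + \mathrm{II}_0 E_{Q2}) - F_2 \\
 &= (C_3 - C_2) \mathrm{II}_0 E_{Q2} + (C_3 - C_2) E_{V2} I_{R2} - F_2 - C_3 I_{R2}
\end{aligned}
\]

We should retain/add Q_2 provided (TLC_{1_} - TLC_{12}) > 0. This
can be expressed as the following:

\[
(C_3 - C_2) \mathrm{II}_0 E_{Q2} + (C_3 - C_2) E_{V2} I_{R2} > F_2 + C_3 I_{R2}
\]

or

\[
\mathrm{II}_0 E_{Q2} + I_{R2} E_{V2} > \frac{C_3 I_{R2} + F_2}{C_3 - C_2}
\]

(Equation 3)

Therefore, one should retain/add Q_2 provided the second practice
fixes more defects during rework than the second practice creates
during rework, provided C_3 is much larger than either F_2 or C_2.
In that limit the right-hand side of Equation 3 approaches I_{R2},
the number of defects created by the rework, while the left-hand
side is the number of defects the practice finds and fixes.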
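
Equation 3 lends itself to a direct check. The sketch below is one possible way to code it, with a hypothetical function name; the sample inputs are the pre-delivery testing values from the article's worked example, where 5,450 defects enter testing.

```python
def q2_retention_check(ii0, i_r2, e_q2, e_v2, c2, c3, f2=0.0):
    """Evaluate Equation 3 and return (defects_fixed, threshold, keep_q2)."""
    if c3 <= c2:
        # Dividing by (C3 - C2) preserves the inequality only when C3 > C2.
        raise ValueError("Equation 3 as written assumes C3 > C2")
    defects_fixed = ii0 * e_q2 + i_r2 * e_v2
    threshold = (c3 * i_r2 + f2) / (c3 - c2)
    return defects_fixed, threshold, defects_fixed > threshold


if __name__ == "__main__":
    fixed, threshold, keep = q2_retention_check(
        ii0=5450, i_r2=1635, e_q2=0.6, e_v2=0.6, c2=40.0, c3=100.0)
    print(f"defects fixed by Q2 and V2: {fixed:.0f}")
    print(f"break-even threshold:       {threshold:.0f}")
    print("retain Q2" if keep else "drop Q2")
```

Multiplying the margin between the two printed quantities by (C_3 - C_2) gives the life-cycle cost difference TLC_{1_} - TLC_{12}.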

Worked  Example 

To  make  the  results  more  solid,  consider  a 
software  development  effort  delivering  a 
million  lines  of  code  over  five  years  by  a 
team  of  100  software  developers.  Given 
that  software  developers  tend  to  change 
jobs  quickly  to  keep  their  skills  current,  one 
can  assume  that  defects  found  during 
design  and  coding  will  be  fixed  and  verified 
by  the  original  author  and  that  defects 
found  late  in  testing  will  be  fixed  and  veri¬ 
fied  by  someone  other  than  the  original 
author.  Fixed  and  sunk  costs  will  be  ignored 
and  some  average  error  rates  and  costs  for 
this  team  of  developers  will  be  guessed: 

• Twenty defects per thousand lines of code inserted during coding
and design: I_0 = 20,000.

• Inspection catches three out of every four defects present:
E_{Q1} = 0.75.

• Three new defects are created for every 10 fixed:
I_{R1} = 0.3*0.75*20,000 = 4,500 (please note this assumes I_{R1}
is proportional to I_0).

• Defect detection and repair costs about one labor hour each:
C_1 = 1.0.

• Rework detection catches nine out of 10 newly created defects:
E_{V1} = 0.9.

• Pre-delivery testing detects six defects out of 10 defects:
E_{Q2} = 0.6.

• Test rework inserts five new defects for every 10 fixed (not the
original author): I_{R2} = 1,635 (that is, 0.5*0.6*II_0, where
II_0 = 20,000*0.25 + 4,500*0.1 = 5,450 defects enter testing).

• Test rework detection catches six out of 10 newly created
defects (not the original rework verifier): E_{V2} = 0.6.

• Test detection and repair costs 40 labor hours each: C_2 = 40.

• Post-delivery error detection and repair costs 100 labor hours
each: C_3 = 100.

• Ignore fixed and sunk costs: F_1 = F_2 = F_3 = 0.

In this case, spreadsheet analysis can be used to compute the
various costs in labor hours:

• TLC_{21} - TLC_{12} = 318,614 > 0 (do not permute the inspection
and testing!).

• TLC_{1_} - TLC_{12} = 91,560 > 0 (do not drop the testing).

• TLC_{_2} - TLC_{12} = 912,150 > 0 (do not drop the inspection).

In this case, permuting the practices raises costs, as does
dropping either practice.
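
The spreadsheet analysis can be approximated with a short script. The following is a sketch under stated assumptions, not the author's spreadsheet: the class and function names are hypothetical, and the rework counts I_{R1} = 4,500 and I_{R2} = 1,635 are held fixed in every scenario, as the algebra above does. Under those assumptions the two "drop" differences come out at 91,560 and 912,150, while the interchange difference evaluates to roughly 318,640, which is close to but not identical to the printed 318,614; the article does not say where that small gap comes from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Practice:
    cost: float         # C: average cost per defect found and fixed
    e_q: float          # E_Q: fraction of incoming defects found
    i_r: float          # I_R: defects inserted by rework (held fixed)
    e_v: float          # E_V: fraction of rework defects found
    fixed: float = 0.0  # F: fixed and sunk costs


def total_life_cycle_cost(i0, practices, c3, f3=0.0):
    """Run the practices in the given order; every defect that escapes
    all of them costs c3 downstream."""
    incoming = i0
    total = f3
    for p in practices:
        total += p.fixed + p.cost * (p.i_r * p.e_v + incoming * p.e_q)
        incoming = incoming * (1.0 - p.e_q) + p.i_r * (1.0 - p.e_v)
    return total + c3 * incoming


inspection = Practice(cost=1.0, e_q=0.75, i_r=4500, e_v=0.9)
testing = Practice(cost=40.0, e_q=0.6, i_r=1635, e_v=0.6)

if __name__ == "__main__":
    base = total_life_cycle_cost(20000, [inspection, testing], c3=100.0)
    swapped = total_life_cycle_cost(20000, [testing, inspection], c3=100.0)
    only_inspection = total_life_cycle_cost(20000, [inspection], c3=100.0)
    only_testing = total_life_cycle_cost(20000, [testing], c3=100.0)
    print(f"TLC_21 - TLC_12 = {swapped - base:,.1f}")
    print(f"TLC_1_ - TLC_12 = {only_inspection - base:,.1f}")
    print(f"TLC__2 - TLC_12 = {only_testing - base:,.1f}")
```

Because the function accepts any ordered list of practices, the same sketch extends to three or more practices in series.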

Just for fun, it is now possible to guess what happens when the
two practices are inspection (Q_1) and unit test (Q_2). Which
should one do first? Assume unit tests are 90 percent effective at
finding defects, but take four hours each to find and fix the
defect (additional unit tests get custom-built to diagnose and
localize the defects) and verification finds 60 percent of defects
created by the rework:

• TLC_{21} - TLC_{12} = 30,821 > 0 (do the inspection first).

• TLC_{1_} - TLC_{12} = 401,556 > 0 (do not drop the testing).

• TLC_{_2} - TLC_{12} = 178,830 > 0 (do not drop the inspection).

This was assuming unit tests were 90 percent effective. One can
ask at what unit test effectiveness the two practices are neutral
to interchange; here it is approximately 55 percent. If the unit
tests are less than 55 percent effective at detecting defects,
they should be performed prior to the inspection.

Note:  Please  do  not  quote  these 
example  results!  Plug  in  your  own  mea¬ 
surements  and  get  real  answers  to  your 
questions. 

Indeed, if we use Equation 1, we can algebraically solve for
interchange neutrality. On a diagram of the parameter space, the
solution to this equation would divide the space into two regions,
one in which the interchange is cost effective and one in which it
is not:

\[
\frac{C_2 - C_1}{C_3}
 + \frac{C_3 - C_2}{C_3} \cdot \frac{I_{R1}}{I_0} \cdot \frac{1 - E_{V1}}{E_{Q1}}
 - \frac{C_3 - C_1}{C_3} \cdot \frac{I_{R2}}{I_0} \cdot \frac{1 - E_{V2}}{E_{Q2}} = 0
\]

This can be solved for 1/E_{Q2} directly:

\[
\frac{1}{E_{Q2}} = \frac{I_0}{(1 - E_{V2}) I_{R2}}
 \left\{ \frac{C_2 - C_1}{C_3 - C_1}
 + \frac{C_3 - C_2}{C_3 - C_1} \cdot \frac{I_{R1}}{I_0} \cdot \frac{1 - E_{V1}}{E_{Q1}} \right\}
\]

(Equation 4)

provided the following inequality constraint holds:

\[
0 < E_{Q2} \le 1
\]
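
Equation 4 can be evaluated directly. The sketch below uses a hypothetical function name and the inspection and unit-test values quoted earlier (C_2 = 4 labor hours, with I_{R2} = 1,635 carried over from the first example, an assumption that appears to be consistent with the printed results). It returns no value when the solution falls outside the allowed range.

```python
def neutral_e_q2(c1, c2, c3, i0, i_r1, i_r2, e_q1, e_v1, e_v2):
    """Solve Equation 4 for the E_Q2 at which swapping Q1 and Q2 is
    cost neutral. Returns None if the solution is not in (0, 1]."""
    scale = i0 / ((1.0 - e_v2) * i_r2)
    bracket = ((c2 - c1) / (c3 - c1)
               + ((c3 - c2) / (c3 - c1)) * (i_r1 / i0) * (1.0 - e_v1) / e_q1)
    inverse = scale * bracket  # this is 1 / E_Q2
    if inverse <= 0:
        return None
    e_q2 = 1.0 / inverse
    return e_q2 if e_q2 <= 1.0 else None


if __name__ == "__main__":
    e_q2 = neutral_e_q2(c1=1.0, c2=4.0, c3=100.0, i0=20000,
                        i_r1=4500, i_r2=1635, e_q1=0.75, e_v1=0.9, e_v2=0.6)
    print(f"interchange-neutral unit test effectiveness: {e_q2:.2f}")
```

The result lands near the 55 percent figure quoted in the text.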

Conclusions 

This analysis has demonstrated the following conclusions:

• In general, the quality practices should be ordered by
increasing average cost to find and fix defects. Fixed costs do
not affect this conclusion, but significant differences in
either defect detection effectiveness or in the effectiveness of
verifying rework-induced defects can modify the conclusion.

• One should retain/add Q_2 provided the second practice fixes
more defects during rework than the second practice creates
during rework, provided C_3 is much larger than either F_2 or
C_2.

The  Measurement  and 
Traceability  Challenge 

Although  the  above  analysis  does  not 
appeal  to  counter-intuitive  reasoning  as  is 
sometimes  the  case  with  statistical  reason¬ 
ing,  there  is  a  much  more  demanding  bar¬ 
rier  to  benefiting  from  this  analysis:  get¬ 
ting  organizations  to  track  defects  to  their 
origin  and  to  measure  the  associated  costs 
of  finding  and  fixing  them.  One  problem 
is  that  quality  practices  are  not  always  per¬ 
formed  and  measured  the  same  way.  Nor 
do  they  necessarily  define  or  count  defects 
in  the  same  way. 

To  utilize  the  analysis  presented  here, 
each  quality  practice  would  need  to  mea¬ 
sure  the  following  items: 

• C: The average cost to find and fix a defect discovered during
the practice.

• I_0: The number of defects inserted prior to the practice.

  o Either exclude those that are never found at all during the
  product life cycle (they carry no actual cost, just potential
  cost).

  o Or, use a fault injection based experimental design to
  estimate I_0 with known accuracy⁸.

• I_R: The number of defects inserted during rework resulting from
the practice.

• E_Q: The fraction of incoming defects (I_0) found by the quality
practice.

• E_V: The fraction of defects inserted during rework (I_R) found
during verification.
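
As one illustration of the fault-injection option in the list above, the sketch below applies the simple seeding estimator: if a practice finds a known fraction of deliberately seeded defects, the same fraction is assumed to apply to native defects. This is an assumption-laden sketch rather than the procedure the article's Note 8 prescribes; the function name and the sample counts are invented for illustration, and the estimate is only as good as the assumption that seeded and native defects are equally easy to find.

```python
def estimate_native_defects(seeded_total, seeded_found, native_found):
    """Return (estimated_native_total, estimated_effectiveness).

    Assumes seeded and native defects are equally likely to be found.
    """
    if seeded_total <= 0 or not 0 < seeded_found <= seeded_total:
        raise ValueError("need 0 < seeded_found <= seeded_total")
    effectiveness = seeded_found / seeded_total  # estimate of E_Q
    estimated_total = native_found / effectiveness  # estimate of I_0
    return estimated_total, effectiveness


if __name__ == "__main__":
    # Invented counts: 50 defects seeded, 40 of them found,
    # 120 native defects found by the same practice.
    total, e_q = estimate_native_defects(50, 40, 120)
    print(f"estimated E_Q: {e_q:.2f}")
    print(f"estimated I_0: {total:.0f}")
```

A real experimental design would also report a confidence interval, which this sketch omits.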

Even though there are only five items to measure, it is necessary
that all quality practices use the same defect definition and that
all⁹ defects get traced to their point of insertion, preferably by
an automated process. Effective version control and configuration
management are essential here. Further, the variable costs, sunk
costs, and fixed costs used in finding and fixing the defects
would need to be captured.
Notes

1. See McCann. "How Much Code Inspection Is Enough?" CrossTalk
July 2001, and "When Is It Cost Effective to Use Formal
Software Inspections?" CrossTalk Mar. 2004.

2. For this analysis to be valid, it is necessary to have enough
data that the concepts of confidence interval and hypothesis
testing are well defined for the quantities of interest,
typically mean and standard deviation. In the case of
single-humped distributions, required sample size is
proportional to the standard deviation of the data.
Statistically stable practices have less variation than
statistically unstable practices, so they require less data to
reach valid conclusions. The exact number of data points
depends on the specific distribution being used. Analyses that
are free of distribution assumptions (non-parametric analysis)
typically take more, not less, data.

3.  The  case  of  three  adjacent  practices  is 
sufficient.  The  general  case  can  be 
derived  using  the  same  analytic 
approach,  although  with  a  bit  more 
algebraic  effort.  Please  note  that  if  two 
practices  find  orthogonal  sets  of 
defects,  then  permuting  them  has  no 
effect  on  overall  cost  effectiveness. 

4. Measurability of various things will depend on the process
maturity of the organization. Capability Maturity Model
Integration (CMMI®) Level 5 organizations will routinely
address information needs related to process cost
effectiveness. CMMI Level 1 organizations will be much less
likely to be able to do so.

5. A fixed cost is one which does not grow in proportion to
activity performed, see
<http://en.wikipedia.org/wiki/Fixed_cost>. A sunk cost is one
which has already been incurred and which cannot be recovered
to a significant degree, see
<http://en.wikipedia.org/wiki/Sunk_cost>. Although they are
quite different, those differences are not relevant to this
analysis.

6. See Section 804
<www.dod.mil/dodgc/olc/docs/2003NDAA.pdf#search=%22bob%20stump%20act%20of%202003%22>.

7. See for instance (current as of 8/2006):

• <http://lean.mit.edu/index.php?option=com_content&task=view&id=47&Itemid=57>.

• <www.dau.mil/conferences/2005/Wednesday/Bl-1345-McElroy-CR73.pdf>.

• <www1.eere.energy.gov/femp/pdfs/lcc_guide_05.pdf>.

• Executive Orders 13101 and 13123.

• OMB Circular A-94.

• FAR Part 7.1, especially 7.105 and Part 52.248-2(b) and
52.248-3.

• Defense Acquisition Guidebook, section 5.1.3.5, "Life Cycle
Cost Optimization" and Chapter 3 "Affordability and Life-Cycle
Resource Estimates."

8. We can put a lower bound on the probable downstream cost of
undiscovered defects actually discovered later in the life
cycle. Given defect injection techniques, it is actually
possible to get statistically valid estimates of the number of
undiscovered defects. This technique is discussed thoroughly
in Mills, Harlan D. "Statistical Validation of Computer
Programs" and in "Software Productivity." Dorset House
Publishing, 1988.

9.  If  a  program  collects  enough  data  and 
has  a  repeatable  practice  of  using  sta¬ 
tistical  techniques  in  process  manage¬ 
ment,  then  all  can  be  relaxed  to  the 
idea  of  a  statistically  significant  sample. 
Software as an Exploitable Source of Intelligence

Dr.  David  A.  Umphress 
College of Aerospace Doctrine, Research, and Education (CADRE)

Security  goes  beyond  the  traditional  notion  of  hacking  into  a  piece  of  software.  Software,  even  without  being  installed  or  used, 
can  reveal  compromising  information,  thus  providing  a  security  risk.  This  article  outlines  four  software  exploitation  categories 
that  should  be  considered  before  a  software  product  is  released. 


Our security practices tend to separate data from software. We
think of data as content, meaning, something of operational value
that can be exploited by an adversary. We tend to think of a
computer program as something that performs tasks and manipulates
data, not as something that has inherent informational value in
itself. But we cannot escape the simple fact that software exists;
it is a collection of computer instructions and supporting data.
As such, it is a thing, something that has the potential to be
broken into, taken apart, scrutinized, cannibalized for parts, or
otherwise used for purposes not originally intended.

It is exactly this potential for useful information that makes
software attractive for software vulnerability attacks. Such
attacks start with software in the same form as it would be given
to a legitimate user and then subject it to a number of static and
dynamic tests in order to reveal compromising information.
Software vulnerability attacks include the traditional notion of
hacking, where a computer is broken into over a communication
network, but also encompass a broad range of other tactics that
view software as a source of intelligence data. Attacks need not
necessarily be launched against software systems that are
operational. They may be more subtle in that the attacker has
physical possession of the software and is examining it for
vulnerabilities under circumstances that are unobserved.

Forms  of  Exploitation 

Software vulnerability attacks draw from the work of software
security, reverse engineering, design reclamation, and software
testing to answer the question, what does the product reveal about
itself? Unless explicitly corrected otherwise during the
development phase, a piece of software has the potential to be
open to four general categories of exploitation: intrusion
penetration, intellectual property penetration, component
penetration, and context penetration. Offering a prescription for
how to identify vulnerabilities in each of these four categories
is difficult and beyond the scope of this article. However,
recognizing that software is open to exploitation beyond the
traditional notion of hacking is a necessary first step toward
developing software processes which address holistic security.

• Intrusion penetration is the act of gaining illicit use of
software. Someone engaging in intrusion penetration would seek
to discover whether the software limits user access to functions
and, if so, how securely the software deals with determining
authorization. Such a vulnerability attack would analyze the
software for how it authenticates users, how (or if) it encrypts
data, what software features it allows users to perform, and so
forth. The ultimate goal of intrusion penetration is to
masquerade as a legitimate user, thus gaining access to as much
functionality and data as the software offers, including, if
possible, access at the level of a super user.

• Intellectual property penetration is the discovery of business
rules, classified information, and protected computations
encoded in software. Consider, for example, a software system
that processes telemetry data from an infrared sensor in order
to detect the heat signature of a missile launch. The software
may well be considered unclassified when stripped of data
relating to the sensor and telemetry stream, but the computer
instructions themselves reveal how the data is processed, and
can thus inadvertently reveal information that could be
exploited. Consider also a personnel database. The structure of
the database alone, absent any data, could reveal insight into
what information is maintained on personnel, organizational
structure, maximum size of organizations, and so forth.

• Component penetration addresses how the software might be used
outside the context for which it was written. This form of
penetration begins by discovering the individual components in
the software, much as an electronics engineer would identify
distinguishable components on a circuit board. Software
components could then be extracted and reused in other software
applications. Alternatively, software components could be
extracted and replaced with substitute components that have the
same interfaces but provide different functionality. In the
first case, an adversary could obtain the use of a critical
software element, such as a decryption algorithm or
communication module, without having to re-create it from
scratch. In the second case, the replaced component could report
on the inner workings of the software, thus giving an intruder a
further toehold into how the software might be further exploited
programmatically. Indeed, it is not infeasible that, under
certain circumstances, an intruder could intercept software
being transmitted over a network and replace components with
ones having nefarious purposes.

• Context penetration extrapolates what is known about the
software under scrutiny to larger systems it may be a part of.
For example, the network communication traffic that a client
receives, processes, and transmits can reveal the purpose and
function of its corresponding server.

It should be noted that none of the exploitation categories noted
above are trivial. Most must be carried out manually through
laborious examination; however, minimal attacks can uncover
surprisingly revealing information, as evidenced by the following:

•  Intrusion  penetration.  A  software 
component  known  as  a  wedge  was  insert¬ 
ed  to  intercept  communication  between 
two  existing  software  components  of  a 
large  military  research  package.  The 
wedge  did  not  interfere  with  the  func¬ 
tioning  of  the  software;  however,  it  dis¬ 
played  the  content  of  data  being  trans¬ 
ferred  from  one  component  to  the  other 
as  the  software  was  being  executed. 
Through  trial  and  error,  the  wedge  was 
successfully  placed  at  a  point  that 
revealed  the  license  key  needed  to 
decrypt  user  information. 

•  Intellectual  property  penetration.  A 
cursory  examination  of  software  used  to 
track  finances  of  a  large  organization 
revealed  the  toll-free  access  phone  num¬ 
ber,  login  name,  and  password  to  the 
organization’s  communication  hub. 
While  this  information  was  not  sufficient 
to  gain  access  to  financial  information,  it 
provided  a  security  hole  through  which 
an adversary could masquerade as a business
unit and submit bogus data on
financial  transactions. 

•  Component  penetration.  The  automat¬ 
ic  software  update  mechanism  was 
extracted  from  a  system  used  to  manage 
membership  information  for  a  national 
non-profit  organization  and  was  trans¬ 
planted  to  another  piece  of  software.  In 
the  original  system,  the  software  refer¬ 
enced  a  local  XML  file  containing  a  URL 
and  the  current  version  numbers  of  the 
individual  software  modules.  Using  the 
URL,  it  obtained  from  the  Web  an  XML 
file  that  described  the  most  recent  ver¬ 
sion  of  each  software  module,  compared 
it  to  the  current  version,  downloaded 
updated  modules,  and  installed  them. 

•  Context  penetration.  The  components 
and  configuration  files  of  the  previous 
system  were  well-enough  named  that 
their  purpose  was  self-evident,  simplify¬ 
ing  the  task  of  isolating  the  automatic 
update  modules  and  adapting  the  config¬ 
uration  files  for  a  totally  different  appli¬ 
cation.  Although  the  server  end  of  the 
update  mechanism  for  the  newly  adapted 
application  had  to  be  constructed  from 
scratch,  it  could  be  modeled  after  the 
original’s  client  and  server  information 
interchange. 
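
The update check described under component penetration can be sketched in a few lines. This is a minimal Python sketch, not the examined system's code: the element and attribute names (update, module, name, version, url), the dotted version format, and the example URL are all assumptions made for illustration, since the layout of the real XML files is not given here. The remote file is supplied as a string, so nothing is actually downloaded.

```python
import xml.etree.ElementTree as ET

# Hypothetical local file: where to look for updates and what is installed.
LOCAL_XML = """
<update url="http://example.invalid/latest.xml">
  <module name="core" version="1.2.0"/>
  <module name="reports" version="2.0.1"/>
</update>
"""

# Hypothetical remote file, as it might be returned from the URL above.
REMOTE_XML = """
<update>
  <module name="core" version="1.10.0"/>
  <module name="reports" version="2.0.1"/>
  <module name="export" version="0.9.0"/>
</update>
"""


def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def read_versions(xml_text):
    """Map each module name in a manifest to its parsed version."""
    root = ET.fromstring(xml_text)
    return {m.get("name"): parse_version(m.get("version"))
            for m in root.findall("module")}


def modules_to_update(local_xml, remote_xml):
    """Return names of modules that are new or newer in the remote file."""
    installed = read_versions(local_xml)
    latest = read_versions(remote_xml)
    return sorted(name for name, version in latest.items()
                  if name not in installed or version > installed[name])


if __name__ == "__main__":
    print(ET.fromstring(LOCAL_XML).get("url"))  # where updates would come from
    print(modules_to_update(LOCAL_XML, REMOTE_XML))
```

Versions are compared as tuples of integers because a plain string comparison would rank "1.10.0" below "1.2.0". Even in this toy form, the files are self-describing enough that the whole mechanism could be lifted out and pointed at a different application, which is the exposure the example above illustrates.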

Relevance  to  Today’s 
Technology 

Why  are  software  vulnerability  attacks  rele¬ 
vant  today?  Harvesting  information  from  a 
software artifact has heretofore been difficult
and time-consuming. Most products of the
past  have  been  delivered  as  a  collection  of 
binary  executable  machine  instructions  that 
mask  the  structure  of  the  product  and  do 
not  give  easy  insight  into  how  the  product 
might  be  exploited.  Modern  technologies 
that  have  come  into  heavy  use  in  the  past  five 
years  (specifically  Java  and  .NET)  have 
reshaped  the  product  landscape  by  permit¬ 
ting  software  to  be  encoded  using  generic 
programming  instructions.  Instead  of  having 
the  computer’s  hardware  directly  execute  the 
instructions, a special program reads each
hardware-generic instruction, verifies that it
will not violate the computer’s security policies,
and carries it out as if it were part of the
computer’s  instruction  set.  Since  the  encod¬ 
ed  instructions  are  intended  to  be  executed 
on  any  hardware  platform,  they  must  carry 
with  them  information  on  software  module 
structures,  data  types,  etc.  This  approach 
allows  software  to  be  written  once  and  run 
on  many  different  hardware  platforms,  thus 
providing  on-demand  delivery  and  installa¬ 
tion  of  software  to  networked  computers 
(often  a  necessary  piece  of  electronic  com¬ 
merce  and  enterprise  computing).  The  disad¬ 
vantage  is  that  it  does  so  at  the  expense  of 
making  the  software  more  open  to  analysis. 
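
How much structure travels with hardware-generic code is easy to see in practice. The sketch below uses Python, whose compiled bytecode serves here only as a convenient stand-in for the Java and .NET formats discussed above; the function check_license and the key it contains are invented for the illustration.

```python
def check_license(user_key):
    """Toy routine standing in for a piece of delivered software."""
    expected_key = "ABC-123"  # invented secret embedded in the code
    return user_key == expected_key


def exposed_metadata(func):
    """Collect what the compiled form reveals without running the function."""
    code = func.__code__
    return {
        "name": code.co_name,            # the function's own name
        "variables": code.co_varnames,   # argument and local variable names
        "constants": code.co_consts,     # literal values, including strings
    }


if __name__ == "__main__":
    for field, value in exposed_metadata(check_license).items():
        print(field, value)
```

The function is never called, yet its name, its variable names, and the embedded string are all recoverable from the compiled object. A native binary stripped of symbols gives an analyst far less to start from.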

Conclusion 

Understanding that software is a source of
intelligence is vital to anyone involved with
developing or distributing software. Of the four
exploitation categories, only one, intrusion
penetration, is typically examined in any
depth for a given software product. Paying
attention to the other categories will become
more important as time goes on due to the
increasingly crucial role software plays in
today’s economy. It is in any organization’s
interest, in both industry and the military, to
minimize the amount of information
exposed by software, whether or not it is
ever installed or executed. ♦
COTS: Commercial Off-The-Shelf
or  Custom  Off-The-Shelf? 


COTS.  Everyone  knows  what  this  means,  right? 

Commercial  off-the-shelf  —  something  you  can  walk 
into  a  store  and  buy.  Well,  maybe.  I  was  recently  ponder¬ 
ing  what  this  meant  while  trying  to  define  it  to  a  young 
associate.  As  we  all  know,  defining  things  for  systems  and 
software  engineers  is  never  easy.  There  are  always  more 
options,  parameters,  and  qualifiers.  Let’s  look  at  a  few 
examples  of  commercial  off-the-shelf  purchasing. 

Since  I  drive  an  ancient  rusting  truck,  I’ve  been  fre¬ 
quenting  car  dealerships  to  find  new  transportation.  It  is 
easy to walk by all the shiny new cars on the lot and think
yes, I can buy one of these, right off the lot. Much to my
dismay,  after  appropriately  discounting  the  asking  price 
and  reconciling  with  my  monthly  budget,  I  needed  some¬ 
thing  not  quite  off  the  shelf.  After  some  negotiations  on 
what  features  I  wanted,  it  became  obvious  that  it  was  best 
that  I  get  more  features  than  I  needed  to  get  just  what  I 
wanted. Base car + package A + options B, C, and D = my
perfect  car,  right  off  the  shelf,  for  a  price,  and  oh,  can  I 
wait  six  weeks  for  it?  I  left  the  dealership  with  my  head 
spinning  from  all  of  the  choices,  still  without  a  COTS  car, 
but  with  my  bank  account  intact. 

After  such  a  grueling  ordeal  at  the  car  lot,  I  decided 
maybe  it  was  time  for  some  lunch  from  my  local  sandwich 
shop.  I  wanted  the  sandwich  in  picture  number  one,  noth¬ 
ing  special,  just  a  basic  sandwich.  What  type  of  bread? 
What  meat?  What  cheese?  What  crunchy  stuff  did  I  want? 
What  dressings?  So  much  for  off-the-shelf.  I  could  have 
chosen  a  number  one  and  then  completely  changed  my 
sandwich  by  making  different  choices  along  the  way.  I 
began  to  think  maybe  the  sandwich  shop  wasn’t  the  best 
place  to  look  for  commercial  standardization.  As  I  left  the 
sandwich  shop,  I  realized  that  commercial  off-the-shelf  might 
not  be  as  standard  as  the  connotation  of  the  phrase 
implies. It is more like custom off-the-shelf.

Of  course,  cars  and  sandwiches  don’t  compare  to 
weapon  systems,  do  they?  Yet  they  are  closer  than  the  mil¬ 
itary  industrial  base  would  like  to  admit.  In  today’s  world, 
the  commercial  world  is  going  towards  more  and  more 
customization  rather  than  a  standard  product  off  the  shelf. 
When  you  bought  your  last  personal  computer,  did  you  go 
buy  one  off  the  shelf  of  the  local  store  or  did  you  go  to  a 
Web  site  and  click  through  pages  and  pages  of  options? 
What  color  of  MP3  player  do  you  want?  Everyday  items 
from  cars  to  toasters  are  now  customized.  It  is  as  if  the 
commercial  industry  realized  that  the  military  got  it  right  — 
customization  is  good. 

What  the  military  really  wants  are  basic  capabilities 
with  lots  of  options  to  customize  their  materiel  at  an 
affordable  price.  The  affordability  is  where  commercial 
industry  surpasses  the  military  industry.  The  additional 
costs  to  customize  my  car  and  my  sandwich  were  minimal. 
The  custom  car  I  almost  bought  would  be  repaired  in  the 
same  shop  as  other  cars  from  the  same  manufacturer,  and 
at  a  standard  labor  rate.  Of  course,  the  military  likes  to 
think they have unique requirements over and above those
required by commercial industry. While there is some truth
to  this,  most  of  the  military  equipment  is  now  coming  up 
to  commercial  standards,  rather  than  commercial  compo¬ 
nents  coming  up  to  military  standards.  While  standards  are 
good,  rarely  does  an  entire  weapon  system  fit  in  any  one 
standard.  Each  component  or  subsystem  may  comply  with 
industry  standards  but  the  components  are  then  custom- 
integrated  into  a  usable  weapon  system. 

So,  why  is  the  military  pursuing  COTS?  I  think  it  is 
because  they  really  want  to  customize  their  purchase,  hop¬ 
ing  that  they  can  get  it  at  affordable  COTS  pricing.  Also, 
they  think  that  COTS  will  speed  up  the  process.  If  there’s 
anything  that  this  issue  of  CROSSTALK  teaches  us,  it’s 
that  COTS  sometimes  saves  neither  time  nor  money. 

See,  customizing  COTS  takes  extra  time  and  extra 
money.  You  want  to  buy  an  off-the-rack  suit?  If  you’re 
exactly  a  44R  coat  and  can  wear  38  pants  with  a  29” 
inseam,  no  problem.  However,  just  the  slightest  change  in 
either  suit  or  pants  size  will  not  only  cost  more,  it  will  dra¬ 
matically  increase  the  out-the-door  time.  Instead  of  wearing 
your  suit  home,  you  have  to  wait  a  week  or  so  for  cus¬ 
tomization  —  same  with  COTS.  You  buy  it  to  save  time  and 
money.  However,  unless  it  fits  you  exactly,  you  will  spend 
additional  time  and  money  getting  it  customized  to  fit  your 
needs. In fact, you might also have to customize the other
software that the COTS product interacts with to make it fit,
which costs even more money and even more time. In the
Department  of  Defense,  sometimes  the  time  is  more 
important  than  the  money  —  and  customizing  COTS  is  typ¬ 
ically  a  very  slow  process. 

So,  how  do  you  want  your  COTS  customized? 

— Wiley E Livingston, Jr., P.E.
USAF, 580 SMXG/Flight C Chief
wiley.livingston@robins.af.mil 

P.S.  And  if  you  lose  or  gain  weight,  and  your  suit  size  (and 
COTS  needs)  change  . . .  well,  let’s  not  even  go  there! 


Can  You  BACKTalk? 

Here  is  your  chance  to  make  your  point,  even  if  it  is  a  bit 
tongue-in-cheek,  without  your  boss  censoring  your  writing.  In 
addition  to  accepting  articles  that  relate  to  software  engineer¬ 
ing  for  publication  in  CROSSTALK,  we  also  accept  articles  for 
the  BackTalk  column.  BackTalk  articles  should  provide  a 
concise,  clever,  humorous,  and  insightful  perspective  on  the 
software  engineering  profession  or  industry  or  a  portion  of  it. 
Your  BackTalk  article  should  be  entertaining  and  clever  or 
original  in  concept,  design,  or  delivery.  The  length  should  not 
exceed  750  words. 

For  a  complete  author’s  packet  detailing  how  to  submit 
your  BackTalk  article,  visit  our  Web  site  at 
<www.stsc.hill.af.mil>.